WebKit Bugzilla
NEW
256825
[SOUP] HTML pages with broken or missing content type not displayed as HTML
https://bugs.webkit.org/show_bug.cgi?id=256825
Guilaume Ayoub
Reported
2023-05-15 20:55:25 PDT
Some websites from Radio France are not displayed as HTML, but as plain text files. They work in other browsers (tested with Chrome and Firefox). They used to work with the same version of WebKitGTK, so something has probably changed on the server. Maybe there’s something wrong with the content-type header (text/html, text/html; charset=UTF-8)? For example:
https://www.radiofrance.fr/franceinter
Attachments
browser and devtools for France inter (1.00 MB, image/png), 2023-05-15 23:01 PDT, Karl Dubost
Karl Dubost
Comment 1
2023-05-15 22:30:35 PDT
Guillaume, what was your User-Agent string?
Guilaume Ayoub
Comment 2
2023-05-15 22:39:19 PDT
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.4 Safari/605.1.15
Karl Dubost
Comment 3
2023-05-15 23:01:52 PDT
Created attachment 466359: browser and devtools for France inter

No luck. I can't reproduce with Guillaume's provided UA string on a MacBook running macOS. No extensions, clean profile.
Guilaume Ayoub
Comment 4
2023-05-16 00:25:41 PDT
(In reply to Karl Dubost from comment #3)
> Created attachment 466359
> browser and devtools for France inter
>
> No luck.
:)
> I can't reproduce with Guillaume's provided UA string on a macOS MacBook.
> No extensions. clean profile.
I have the problem with Epiphany. Note that for me, in the web inspector’s network tab, the page’s MIME type is "text/plain" (but the content is HTML).
Michael Catanzaro
Comment 5
2023-05-16 05:35:30 PDT
(In reply to Guilaume Ayoub from comment #0)
> Maybe there’s something wrong with the content-type
> header (text/html, text/html; charset=UTF-8)?
Seems like a very good guess. That looks pretty messed up. I'm not sure what we should do about it though. I guess we could fall back to text/html if the page does not have any valid content type, but I'm not sure if that's actually safe to do. Depends on how other browsers behave. We don't want to start processing text files as HTML, for example.

Another page with a similar problem is https://doctors.bjc.org/wlp2/bjc/doctors/search but this one apparently just doesn't have any content type header at all.
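
To make the header problem concrete, here is a minimal Python sketch of two ways a client might read the value reported above. It is an illustration only, not WebKit's or libsoup's code. The function names `parse_media_type` and `extract_media_type` are hypothetical. The second function is loosely modeled on the "extract a MIME type" steps of the Fetch Standard (split on commas, keep the last value that parses); whether that is why other browsers render the page is an assumption, not something confirmed in this thread. The sketch ignores quoted commas and the standard's charset carry-over rules.

```python
import re

# HTTP "token" characters (RFC 9110 tchar).
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def parse_media_type(value):
    """Parse one media type; return (essence, params) or None if invalid."""
    essence, _, rest = value.strip().partition(";")
    type_, slash, subtype = essence.strip().partition("/")
    # Both halves must be non-empty HTTP tokens: no commas, spaces or slashes.
    if not slash or not _TOKEN.fullmatch(type_) or not _TOKEN.fullmatch(subtype):
        return None
    params = {}
    for part in rest.split(";"):
        name, eq, val = part.strip().partition("=")
        if eq and _TOKEN.fullmatch(name):
            # First occurrence of a parameter wins.
            params.setdefault(name.lower(), val.strip('"'))
    return f"{type_}/{subtype}".lower(), params


def extract_media_type(header_value):
    """Split a (possibly comma-joined) header value; keep the last valid type."""
    if header_value is None:
        return None  # no Content-Type header at all
    result = None
    for candidate in header_value.split(","):  # simplification: ignores quoted commas
        parsed = parse_media_type(candidate)
        if parsed is not None:
            result = parsed
    return result


bogus = "text/html, text/html; charset=UTF-8"
strict = parse_media_type(bogus)
lenient = extract_media_type(bogus)
```

With the value from this report, `strict` is `None` because the would-be subtype `html, text/html` is not a valid token, while `lenient` is `("text/html", {"charset": "UTF-8"})`. A response with no Content-Type header yields `None` in both cases, so it would still need a separate decision (sniffing or a fallback), which is the safety question raised above.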
Karl Dubost
Comment 6
2023-05-16 05:44:10 PDT
Content-Type: text/html, text/html; charset=UTF-8

First of all, they should be contacted. Instead of a fallback, maybe it's possible to do a Quirk for this specific case.

I just realized that Safari is receiving the same thing. This is a typical bug for https://webcompat.com/

Firefox is also receiving the same bogus Content-Type. So it's at least not based on User-Agent sniffing.

If the bug is opened on webcompat.com aka (https://github.com/webcompat/web-bugs/) it will be easier to contact people on radiofrance https://github.com/orgs/radiofrance/people

Let's try first on Mastodon. I will send a message.
Guilaume Ayoub
Comment 7
2023-05-16 05:53:18 PDT
(In reply to Karl Dubost from comment #6)
> Let's try first on Mastodon. I will send a message.

OK. Thanks!

> If the bug is opened on webcompat.com aka
> (https://github.com/webcompat/web-bugs/)
> it will be easier to contact people on radiofrance
> https://github.com/orgs/radiofrance/people

Can I do this now, or should we wait for an answer on Mastodon?
Karl Dubost
Comment 8
2023-05-16 05:56:09 PDT
First attempt: https://mastodon.cloud/@Karlcow/110378453711258132
Karl Dubost
Comment 9
2023-05-16 05:56:49 PDT
It can still be opened on webcompat.com, and I can continue there to contact radiofrance.
Guilaume Ayoub
Comment 10
2023-05-16 06:03:57 PDT
(In reply to Karl Dubost from comment #9)
> It can still be opened on webcompat.com, and I can continue there to contact
> radiofrance.

Reported here: https://github.com/webcompat/web-bugs/issues/122352
Youenn Piolet
Comment 11
2023-05-16 08:42:29 PDT
Hi Karl, hi Guillaume,

Thanks for the notification. The dev teams have been made aware and we can reproduce. I'm on holiday but I think a fix is on its way :)

Cheers,
Youenn Piolet
Comment 12
2023-05-16 09:10:31 PDT
It should be fixed now :) Thanks again for the notification.
Michael Catanzaro
Comment 13
2023-05-16 10:08:09 PDT
Thanks Youenn! I'd like to keep this bug open though, since to be web compatible we'll need to figure out what to do about other pages with this problem. I've retitled the bug to reflect that Radio France is no longer broken.