Bug 254021 - Compression Streams not handling large outputs during the flush stage
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc.
Version: Safari Technology Preview
Hardware: Mac (Apple Silicon)
OS: macOS 12
Importance: P2 Normal
Assignee: Brandon
URL:
Keywords: InRadar
Duplicates: 256330
Depends on:
Blocks:
 
Reported: 2023-03-16 07:17 PDT by bzugmeyer@gmail.com
Modified: 2023-05-17 13:35 PDT
CC: 5 users

See Also:



Description bzugmeyer@gmail.com 2023-03-16 07:17:08 PDT
I am testing the new `CompressionStream` API in Safari Technology Preview. When I try to compress a moderately large buffer, the data it produces appears to be invalid.

To reproduce:

  Compress a large enough buffer of non-repeating data. In the following example, I use a buffer of approximately 48 kB. The helper functions come from the WebKit CompressionStream test suite[1]:

  const input = new TextEncoder().encode(
    JSON.stringify(Array.from({ length: 10_000 }, (_, i) => i))
  );
  const output = await compressArrayBuffer(input, "deflate");
  assert_array_equals(input, pako.inflate(output));

Expected:

  `pako.inflate` returns a buffer equal to the input buffer.

Actual:

  The decompression fails with `pako.inflate` returning `undefined`.

Notes:

  With this particular example, I noticed that it works correctly when compressing up to 35_578 bytes. The issue only occurs when I try to compress more bytes (>= 35_579).

  Please see this page[2] for a practical way to reproduce the issue. In Chrome, every test succeeds, but in Safari the tests with larger inputs fail.

[1]: https://github.com/WebKit/WebKit/blob/20329b62061b40d5a423a1d75b67779945b84729/LayoutTests/imported/w3c/web-platform-tests/compression/compression-stream.tentative.any.js
[2]: https://safari-compressionstream-issue.benoitzugmeyer.repl.co
Comment 1 Alexey Proskuryakov 2023-03-16 08:59:08 PDT
Hopefully this is a dupe of bug 252474.
Comment 2 Radar WebKit Bug Importer 2023-03-23 07:18:14 PDT
<rdar://problem/107133345>
Comment 3 Brandon 2023-03-23 11:41:36 PDT
Thanks for the bug report. I can reproduce this in Safari, Safari Technology Preview, and MiniBrowser. From what I can tell at first glance, this is not a dupe of the other bug.


<!DOCTYPE html>
<html>
<script type="module">
import * as pako from "https://cdnjs.cloudflare.com/ajax/libs/pako/2.1.0/pako.esm.mjs";
import * as fflate from "https://unpkg.com/fflate@0.7.4/esm/browser.js";

// JSON-encoded array of ten thousand numbers ("[0,1,2,...]"). This produces 48_891 bytes of data.
const fullData = new TextEncoder().encode(
  JSON.stringify(Array.from({ length: 10_000 }, (_, i) => i))
);

await test(10)
await test(1_000)
await test(10_000)
await test(30_000)
await test(40_000) // fails
await test(35_578) // succeeds
await test(35_579) // fails

async function test(bytesLength) {
  const data = fullData.subarray(0, bytesLength)
  const compressedData = await compressArrayBuffer(data, "deflate");
  
  // Decompress with pako, and check that we got the same result as our original string
  // Similar to https://github.com/WebKit/WebKit/blob/20329b62061b40d5a423a1d75b67779945b84729/LayoutTests/imported/w3c/web-platform-tests/compression/compression-stream.tentative.any.js#L54-L55
  try {
    assert_array_equals(data, pako.inflate(compressedData));
    console.log(`[pako] Succeeded with ${bytesLength} bytes`)
  } catch (error) {
    console.error(`[pako] Failed with ${bytesLength} bytes:`, error)
  }

  // Double check with another library
  try {
    assert_array_equals(data, fflate.unzlibSync(compressedData));
    console.log(`[fflate] Succeeded with ${bytesLength} bytes`)
  } catch (error) {
    console.error(`[fflate] Failed with ${bytesLength} bytes:`, error)
  }

  try {
    assert_array_equals(data, await decompress(compressedData));
    console.log(`[DecompressionStream] Succeeded with ${bytesLength} bytes`);
  } catch (error) {
    console.error(`[DecompressionStream] Failed with ${bytesLength} bytes:`, error);
  }
}

async function compressArrayBuffer(input, format) {
  const cs = new CompressionStream(format);
  const writer = cs.writable.getWriter();
  writer.write(input);
  const closePromise = writer.close();
  const out = [];
  const reader = cs.readable.getReader();
  let totalSize = 0;
  while (true) {
    const { value, done } = await reader.read();
    if (done)
      break;
    out.push(value);
    totalSize += value.byteLength;
  }
  await closePromise;
  const concatenated = new Uint8Array(totalSize);
  let offset = 0;
  for (const array of out) {
    concatenated.set(array, offset);
    offset += array.byteLength;
  }
  return concatenated;
}

function assert_array_equals(a, b) {
  if (!a) {
    throw new Error(`Arrays not equal: a is falsy (${a})`)
  }
  if (!b) {
    throw new Error(`Arrays not equal: b is falsy (${b})`)
  }
  if (a.length !== b.length) {
    throw new Error(`Arrays not equal: a.length !== b.length (${a.length} !== ${b.length})`)
  }
  a.forEach((v, i) => {
    if (a[i] !== b[i]) {
      throw new Error(`Arrays not equal: a[${i}] !== b[${i}] (${a[i]} !== ${b[i]})`)
    }
  })
}

async function concatenateStream(readableStream) {
  const reader = readableStream.getReader();
  let totalSize = 0;
  const buffers = [];
  while (true) {
    const { value, done } = await reader.read();
    if (done) {
      break;
    }
    buffers.push(value);
    totalSize += value.byteLength;
  }
  reader.releaseLock();
  const concatenated = new Uint8Array(totalSize);
  let offset = 0;
  for (const buffer of buffers) {
    concatenated.set(buffer, offset);
    offset += buffer.byteLength;
  }
  return concatenated;
}

async function decompress(view) {
  const ds = new DecompressionStream('deflate');
  const writer = ds.writable.getWriter();
  writer.write(view);
  writer.close();
  return await concatenateStream(ds.readable);
}
</script>
</html>
Comment 4 Brandon 2023-05-10 23:14:49 PDT
So what was happening here is that we missed an edge case: during the flush step, there may be more output data than the allocated output buffer can hold. Since avail_in was 0 at that point, we would just exit. We need to verify that the stream has actually ended before exiting.

The reason the 35_579-byte input failed is that the output buffer allocated for the flush step was 16384 bytes, while the flushed output was just slightly larger. The first compression pass output only 2 bytes; the remaining 13686 were only emitted during the flush step.
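
For context, the usual zlib pattern here is to keep calling deflate(Z_FINISH) with a fresh output buffer until it returns Z_STREAM_END, because avail_in can already be 0 while deflate still holds buffered output. Below is a minimal standalone sketch against the plain zlib C API, for illustration only; it is not WebKit's CompressionStreamEncoder code, and the deflateAll name and 16384 chunk size are just assumptions for the example.

// Minimal sketch of a zlib finish loop (plain zlib C API; names and buffer size
// are illustrative, not WebKit's actual implementation).
#include <zlib.h>
#include <cstdint>
#include <stdexcept>
#include <vector>

std::vector<uint8_t> deflateAll(const uint8_t* data, size_t size)
{
    z_stream stream { };
    // windowBits = 15 selects the zlib wrapper, matching the "deflate" format.
    if (deflateInit2(&stream, Z_DEFAULT_COMPRESSION, Z_DEFLATED, 15, 8, Z_DEFAULT_STRATEGY) != Z_OK)
        throw std::runtime_error("deflateInit2 failed");

    stream.next_in = const_cast<Bytef*>(reinterpret_cast<const Bytef*>(data));
    stream.avail_in = static_cast<uInt>(size);

    std::vector<uint8_t> output;
    std::vector<uint8_t> chunk(16384); // same chunk size as the allocation discussed above

    int result = Z_OK;
    do {
        stream.next_out = chunk.data();
        stream.avail_out = static_cast<uInt>(chunk.size());
        result = deflate(&stream, Z_FINISH);
        if (result != Z_OK && result != Z_STREAM_END && result != Z_BUF_ERROR) {
            deflateEnd(&stream);
            throw std::runtime_error("deflate failed");
        }
        output.insert(output.end(), chunk.begin(), chunk.begin() + (chunk.size() - stream.avail_out));
        // Stopping when avail_in == 0 would be too early: the flush can emit
        // more than one chunk of output after all the input has been consumed.
    } while (result != Z_STREAM_END);

    deflateEnd(&stream);
    return output;
}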


This fixes it.

 @@ -132,8 +131,8 @@ ExceptionOr<RefPtr<JSC::ArrayBuffer>> CompressionStreamEncoder::compress(const u
         result = deflate(&m_zstream, (m_finish) ? Z_FINISH : Z_NO_FLUSH);
         if (result != Z_OK && result != Z_STREAM_END && result != Z_BUF_ERROR)
             return Exception { TypeError, "Failed to compress data."_s };
-
-        if (!m_zstream.avail_in) {
+
+        if (!m_zstream.avail_in && (!m_finish || (m_finish && result == Z_STREAM_END))) {
             shouldCompress = false;
             output.resize(allocateSize - m_zstream.avail_out);
         }

Working on the PR now.
Comment 5 Brandon 2023-05-10 23:18:29 PDT
Pull request: https://github.com/WebKit/WebKit/pull/13741
Comment 6 EWS 2023-05-11 20:07:13 PDT
Committed 263997@main (a30d9284f5b9): <https://commits.webkit.org/263997@main>

Reviewed commits have been landed. Closing PR #13741 and removing active labels.
Comment 7 Brent Fulgham 2023-05-17 13:35:17 PDT
*** Bug 256330 has been marked as a duplicate of this bug. ***