These appear to fail only on WPT.fyi, not in our infrastructure. The failures look like very subtle rendering differences caused by float/double precision when combining multiple transforms.
<rdar://problem/114542837>
We could potentially use float precision instead of double, which would hopefully make the results identical. Alternatively, we could use precomputed sin/cos values for 'common' degree values. We could also just add (or increase) fuzzy matching on the tests, since the rendering is visually identical.
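
To illustrate the precomputed sin/cos idea, here is a minimal sketch in Python (not WebKit code). The names `_EXACT_SIN_COS`, `sin_cos_degrees`, and `rotation_matrix` are hypothetical, and treating only multiples of 90 degrees as 'common' is an assumption. It shows that going through radians leaves a tiny nonzero cosine for 90 degrees, while a lookup returns exact values.

```python
import math

# Hypothetical table: exact (sin, cos) pairs for quarter-turn angles.
# Which angles count as 'common' is an assumption made for this sketch.
_EXACT_SIN_COS = {
    0.0: (0.0, 1.0),
    90.0: (1.0, 0.0),
    180.0: (0.0, -1.0),
    270.0: (-1.0, 0.0),
}


def sin_cos_degrees(degrees):
    """Return (sin, cos) for an angle in degrees, exact for quarter turns."""
    # Python's % yields a value in [0, 360) for a positive modulus,
    # so -90 maps to 270 and 450 maps to 90.
    normalized = degrees % 360.0
    exact = _EXACT_SIN_COS.get(normalized)
    if exact is not None:
        return exact
    # Fall back to the usual conversion through radians.
    radians = math.radians(degrees)
    return (math.sin(radians), math.cos(radians))


def rotation_matrix(degrees):
    """Return (a, b, c, d) of a 2D rotation, in CSS matrix() order."""
    s, c = sin_cos_degrees(degrees)
    return (c, s, -s, c)


if __name__ == "__main__":
    # The naive path leaves a small residue instead of an exact zero.
    print(math.cos(math.radians(90)))
    # The lookup path returns exact entries.
    print(rotation_matrix(90))
```

With exact entries for quarter turns, composing several such rotations would not accumulate the small residues that the radians path produces, though this does nothing for arbitrary angles.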