For completeness, I built it with gcc-14 on Apple silicon; the timings are in the code block below. It isn't the result I expected: I imagined 16-wide would be slower than 8-wide on CPUs that lack 16-wide in silicon. I should repeat this test on Linux and pit gcc against clang to see how their optimisations fare.
Code:
~/Builds/quadbike-2.0.1/src/quadbike_gcc_none -! -v Castle\ Quest.aiff  31.53s user 0.29s system 99% cpu 31.992 total
~/Builds/quadbike-2.0.1/src/quadbike_gcc_sse -! -v Castle\ Quest.aiff  11.61s user 0.23s system 97% cpu 12.167 total
~/Builds/quadbike-2.0.1/src/quadbike_gcc_avx2 -! -v Castle\ Quest.aiff  14.78s user 0.25s system 97% cpu 15.388 total
~/Builds/quadbike-2.0.1/src/quadbike_gcc_avx512 -! -v Castle\ Quest.aiff  11.29s user 0.23s system 97% cpu 11.847 total
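
To make the comparison easier to read, here is a quick sketch that turns the user times above into speedups relative to the non-SIMD build. The dictionary keys are only labels borrowed from the binary name suffixes, and the choice of user time (rather than wall-clock total) as the metric is my assumption.

```python
# User-mode CPU times in seconds, copied from the timing output above.
# Keys are labels taken from the binary suffixes (an illustrative choice).
user_seconds = {
    "none": 31.53,
    "sse": 11.61,
    "avx2": 14.78,
    "avx512": 11.29,
}

baseline = user_seconds["none"]

# Speedup = baseline time / build time; values above 1.0 mean faster than baseline.
speedups = {build: baseline / secs for build, secs in user_seconds.items()}

# Print the builds from fastest to slowest.
for build, factor in sorted(speedups.items(), key=lambda item: item[1], reverse=True):
    print(f"{build:>7}: {user_seconds[build]:6.2f}s user, {factor:.2f}x vs none")
```

On these numbers the sse and avx512 builds land within a few percent of each other, while the avx2 build is the slowest of the three SIMD variants.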
Statistics: Posted by Sazhen86 — Sun Sep 15, 2024 10:07 pm