Quote:
@Brian: The relevant settings are /QaxW /QxW
Which should mean that there's no dispatching, since the required and used settings are identical. /QaxW differs from /QaxN in that the N version checks (according to Intel, anyway) that you're running on an actual Intel CPU, while /QaxW supposedly also accepts compatible CPUs. But since there should be no dispatching in the first place, this should not make any difference.
I've also never noticed CPU dispatching code when browsing through the generated assembly files or when running performance analysis programs.
Oh, and before I used these settings, when I generated separate code paths (generic and SSE2, or SSE and SSE2), the performance was a lot worse because in some cases the compiler somehow didn't "see" that the SSE2 version would be faster and didn't generate it. That behavior is gone now, which also seems to indicate that the dispatching is gone.
According to what I'm reading, you have to use BOTH of those switches. /QaxW alone would generate an SSE2 code path plus a default code path targeting the 386. Per Intel's documentation here
http://software.intel.com/en-us/article ... -overview/ under "Recommendations", adding /QxW forces the default code path to be SSE2 rather than 386 (386 being the default for 32-bit compilation in version 10.1 of the compiler).
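To make the difference between the two setups concrete, here is a rough C model of which code path ends up running. This is only a sketch of the idea, not Intel's actual dispatcher: the types, the select_path function and the alternate_intel_only flag are all made up for illustration, and the vendor check simply models what is claimed above for the N variant.

```c
#include <stdbool.h>

/* Instruction-set levels, ordered from least to most capable. */
typedef enum { ISA_GENERIC = 0, ISA_SSE = 1, ISA_SSE2 = 2 } isa_level;

/* Hypothetical model of a build: a baseline path plus an alternate path. */
typedef struct {
    isa_level baseline;        /* set by a /Qx-style switch; ISA_GENERIC if absent */
    isa_level alternate;       /* added by a /Qax-style switch */
    bool alternate_intel_only; /* assumed vendor check on the alternate path */
} build_config;

/* Hypothetical description of the CPU the binary runs on. */
typedef struct {
    isa_level max_isa;
    bool is_intel;
} cpu_info;

/* Returns the ISA level of the path that would run, or -1 if the CPU
   cannot even run the baseline path. */
int select_path(build_config b, cpu_info c) {
    if (c.max_isa < b.baseline) {
        return -1; /* baseline itself is too new for this CPU */
    }
    /* Dispatch only matters when the alternate path is better than the baseline. */
    if (b.alternate > b.baseline &&
        c.max_isa >= b.alternate &&
        (!b.alternate_intel_only || c.is_intel)) {
        return (int)b.alternate;
    }
    return (int)b.baseline;
}
```

In this model, a build with a generic baseline and a vendor-checked SSE2 alternate sends an SSE2-capable non-Intel CPU down the generic path. A build where baseline and alternate are both SSE2 has nothing left to dispatch, so that same CPU gets the SSE2 code.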
Edit: Re-reading the Intel documentation on that page, it is written from the point of view of compiler version 11 or higher. Version 11 is when the /Qax option began generating SSE2 as the default path. If you upgraded to version 11 or 12 of the compiler, you wouldn't need to force the default to SSE2.
I'll check a little more after I put supper in the oven, but if that holds true, then there shouldn't be an AMD penalty.
Edit 2: (Both edits made after supper, but anyway)
If it isn't too much trouble, could you compile a DSP version with /QaxN /QxN and let me see whether it 1) runs and 2) performs better on Athlon64? From what I recall with SETI@Home, the N option did work better.