Quote:
Quote:
I am again very confused. Why would you want to run the non-IPP code at SSE3, but have the IPP code be SSE2 or lower? That's not a proper test, in my opinion.
I have tested both, separately. Neither had any measurable effect on the CPU load.
Testing the two parts separately does not exercise an end-to-end SSE3 code path. Based on your description, BETA053A does not have end-to-end SSE3 support.
To summarize:
- You have an Intel system, not an AMD system.
- On Intel systems, a sub-optimal code path only results if the person doing the compile deliberately targets a lower instruction set.
- On AMD systems, a sub-optimal code path can result in numerous situations regardless of what the person doing the compile does.
- If 43% of the time is spent in the IPP libraries, and you built the non-IPP code for SSE3 but left the IPP code at FPU, MMX, SSE, or SSE2, then you do not have a beginning-to-end SSE3 code path, so it is an invalid test case.
- The performance you will see when SSE3 code is mixed with code built for a lower instruction set will likely be weighted towards the lower instruction set.
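To put rough numbers on that last point, here is a minimal sketch using Amdahl's law. Only the 43% figure comes from this thread. The 1.2x gain is a made-up placeholder, not a measurement, and the function and variable names are mine. It assumes the IPP and non-IPP portions run one after the other and that an SSE3 build speeds up each portion by the same factor, which may well not hold in practice.

```python
def overall_speedup(fraction_accelerated, local_speedup):
    """Amdahl's law: whole-program speedup when only part of the run time gets faster.

    fraction_accelerated: share of total run time that benefits (0.0 to 1.0)
    local_speedup: speedup factor applied to that share (must be > 0)
    """
    if not 0.0 <= fraction_accelerated <= 1.0:
        raise ValueError("fraction_accelerated must be between 0 and 1")
    if local_speedup <= 0:
        raise ValueError("local_speedup must be positive")
    untouched = 1.0 - fraction_accelerated
    return 1.0 / (untouched + fraction_accelerated / local_speedup)


IPP_SHARE = 0.43                 # from this thread: time spent inside the IPP libraries
NON_IPP_SHARE = 1.0 - IPP_SHARE  # everything else
HYPOTHETICAL_SSE3_GAIN = 1.2     # assumption for illustration only, not measured

# Mixed build: only the non-IPP code is SSE3, IPP stays on a lower code path.
mixed = overall_speedup(NON_IPP_SHARE, HYPOTHETICAL_SSE3_GAIN)

# End-to-end build: both the non-IPP code and IPP take the SSE3 path.
end_to_end = overall_speedup(1.0, HYPOTHETICAL_SSE3_GAIN)

print(f"mixed build:      {mixed:.3f}x")
print(f"end-to-end build: {end_to_end:.3f}x")
```

Whatever the real per-section gain turns out to be, the mixed build can only ever collect the part of it that applies to the 57% outside IPP, which is why a "no difference" result from a mixed build says little about an end-to-end one.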
As you don't have an AMD system, you have no definitive way to tell what does or does not happen on one. You cannot infer that because your Intel system sees no difference, no processor would see a difference. Nor can you rely on user reports for BETA053A, because, again, performance in a mixed environment would likely be weighted towards the lower instruction set - the "lowest common denominator".
I know you're busy, and I'm sure you want to focus on revenue-generating / competing feature activities, but could you please just let me and those of us on AMD systems *TRY* what I suggested? I mean a serious and genuine effort to make sure that SSE3 gets utilized, or failing that, even just SSE2.
If it doesn't help at all, then the subject can be dropped, but it needs to be a serious effort, not a mixed-case situation.