Quote:
Problem right now is that my test system is unreliable, but I need to figure out if the AVX2 version is faster than AVX.

and if there are other paths that might also be faster on E2220's etc. (SSE3, SSE4.1, ...).
For the SSE tests, Wolfdale-based Core 2 Duos (e.g. the E8x00 series) or Penryn-based Core 2 Quads (e.g. Q9x50) would actually be better, as they support SSE 4.1. The Q9550, for instance, consists of two E8300-class dies in the same package. These processors are the last of the Core 2 generation, released just before Nehalem, the 1-st gen i-architecture (SSE 4.2 was only introduced with Nehalem). As far as I remember, Bojcha has an E8400. A mobile T8300 (or similar) would be good as well.
The E2220 can only make use of instructions up to Supplemental SSE3 (SSSE 3), like the E6400, the Q6600 and all other Conroe dual-core/Kentsfield quad-core based CPUs.
With a 1-st gen i-series CPU such as the i7 920, on the other hand, the full set of SSE instructions (including SSE 4.2) can be checked, without AVX. The same goes for the mobile i3-370M, which I can test if my girlfriend allows me to.
So to put it straight, as far as Intel goes:
1-st gen Core 2 Conroe/Kentsfield (and their mobile versions) - SSE 1, SSE 2, SSE 3, SSSE 3
2-nd gen Core 2 Wolfdale/Penryn (and their mobile versions) - SSE 1, SSE 2, SSE 3, SSSE 3, SSE 4.1
1-st gen i-series - SSE 1, SSE 2, SSE 3, SSSE 3, SSE 4.1, SSE 4.2
2-nd/3-rd gen i-series - SSE 1, SSE 2, SSE 3, SSSE 3, SSE 4.1, SSE 4.2, AVX
4-th gen i-series - SSE 1, SSE 2, SSE 3, SSSE 3, SSE 4.1, SSE 4.2, AVX, AVX2, FMA3
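To make the list above easier to use when deciding which code path to benchmark on which machine, here is a rough sketch in Python. The names ISA_ORDER, NEWEST_ISA, supported_isas and best_path are made up for this illustration and are not part of any existing tool. It assumes the Intel sets stack up strictly in the order listed, and it leaves FMA3 out to keep the ordering simple.

```python
# Instruction sets in order of introduction, oldest first.
# Assumption: each Intel generation in the list supports everything older.
ISA_ORDER = ["SSE", "SSE2", "SSE3", "SSSE3", "SSE4.1", "SSE4.2", "AVX", "AVX2"]

# Newest instruction set per Intel generation, taken from the list above.
NEWEST_ISA = {
    "Core 2 Conroe/Kentsfield": "SSSE3",
    "Core 2 Wolfdale/Penryn": "SSE4.1",
    "1st gen i-series": "SSE4.2",
    "2nd/3rd gen i-series": "AVX",
    "4th gen i-series": "AVX2",
}


def supported_isas(generation):
    """Return every instruction set the given generation supports."""
    newest = NEWEST_ISA[generation]
    return ISA_ORDER[: ISA_ORDER.index(newest) + 1]


def best_path(generation, implemented_paths):
    """Return the newest implemented path this generation can run, or None."""
    usable = [isa for isa in supported_isas(generation) if isa in implemented_paths]
    return usable[-1] if usable else None


if __name__ == "__main__":
    # Hypothetical set of code paths the program ships with.
    paths = {"SSE2", "SSE4.1", "AVX", "AVX2"}
    for gen in NEWEST_ISA:
        print(gen, "->", best_path(gen, paths))
```

With this, a Wolfdale machine would be assigned the SSE 4.1 path and a 1-st gen i-series machine would fall back to it as well, since no SSE 4.2 path is in the example set.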
On the AMD side, the latest CPUs and APUs support SSE 4.1, SSE 4.2 and AVX, as well as FMA3 and FMA4. They made quite a leap recently in that respect. Unfortunately, I sold my FX 8320 and don't have any of the new APUs either.
And now we know which CPUs to test with to compare the impact of the instruction set on ST (single-threaded) performance.
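Since the test system was described as unreliable, one way to get a steadier comparison is to run each path several times and compare medians rather than single runs. The following is only a sketch of that idea. The function names median_time and compare_paths are invented here, and the workloads in the usage part are stand-ins, not the real SSE/AVX code paths.

```python
import statistics
import time


def median_time(func, repeats=15):
    """Run func() `repeats` times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare_paths(paths, repeats=15):
    """paths maps a label to a zero-argument callable; returns label -> median seconds."""
    return {label: median_time(func, repeats) for label, func in paths.items()}


if __name__ == "__main__":
    # Stand-in workloads; in practice each callable would launch one code path.
    results = compare_paths({
        "path A": lambda: sum(range(200_000)),
        "path B": lambda: sum(range(100_000)),
    })
    for label, seconds in sorted(results.items(), key=lambda item: item[1]):
        print(f"{label}: {seconds * 1000:.3f} ms")
```

The median is less sensitive than the mean to the occasional slow run caused by background activity, which is why it is used here.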
Personally, since SSE and AVX are more widely available and supported, I would opt for SSE and AVX optimizations rather than AVX2, especially if the latter takes a lot of effort to achieve.