Stereo Tool
https://forums.stereotool.com/

Stereo Tool 7.51 BETA
https://forums.stereotool.com/viewtopic.php?t=5635
Page 17 of 44

Author:  hvz [ Tue Sep 30, 2014 6:21 pm ]
Post subject:  Re: Stereo Tool 7.51 BETA

Quote:
Hi Hans,
Quote:
Wooh! If I enable all the new optimizations for i3/5/7 processors, I can run a complete FM preset including composite clipper and even Stokkemask with a CPU load of 10%! (CPU: i7-4770S). Without Stokkemask 7%! Streaming preset 2%.
For those of us who use Intel i7 4770K or 4790K systems this type of CPU load figure is excellent news.

Is there a chance there could be a build for those CPUs to allow a machine to run an FM and a Stream version at the same time and still keep the machine just happily working away under so little load?

C.
Yes, soon. I want to run a few more tests first to find out what I can and cannot do to improve performance. I think that means 4 more DLL's... The first 2 are coming now.

Author:  hvz [ Tue Sep 30, 2014 6:25 pm ]
Post subject:  Re: Stereo Tool 7.51 BETA

Here are 2 new DLL's. I don't see any difference between these two and the NEW/NEW build from before, but I keep having some weird problem with my test system (it BSOD's regularly, and its speed doesn't seem to be constant). So, measurements from others would be very useful... Based on the settings used these two might be slightly faster than the NEW/NEW version.
http://www.stereotool.com/download/dsp_ ... 30-fas.dll
http://www.stereotool.com/download/dsp_ ... 30-nei.dll

Based on the results of these measurements, I will choose one of the 3 versions (NEW/NEW, FAS or NEI) to continue with. At that point I'm going to see what happens when I add the i3/5/7 optimizations, which will determine whether I need to release 2 separate versions (which would be very annoying) or can combine them in a single version.

Author:  Slawomir B. [ Tue Sep 30, 2014 7:04 pm ]
Post subject:  Re: Stereo Tool 7.51 BETA

Quote:
Here are 2 new DLL's. I don't see any difference between these two and the NEW/NEW build from before, but I keep having some weird problem with my test system (it BSOD's regularly, and its speed doesn't seem to be constant). So, measurements from others would be very useful... Based on the settings used these two might be slightly faster than the NEW/NEW version.
http://www.stereotool.com/download/dsp_ ... 30-fas.dll
http://www.stereotool.com/download/dsp_ ... 30-nei.dll

Based on the results of these measurements, I will choose one of the 3 versions (NEW/NEW, FAS or NEI) to continue with. At that point I'm going to see what happens when I add the i3/5/7 optimizations, which will determine whether I need to release 2 separate versions (which would be very annoying) or can combine them in a single version.
Since tests take some time - which CPUs do you prefer to test with first? i5 3210m, i3-3220T, E2220, i3-370m, Athlon II x4 620?

EDIT: Same 6:11 track and conditions as yesterday, King's Fire preset

i5-3210m

FAS: 1:33
NEI: 1:34

These are identical to the new/new

Athlon II x4 620

FAS: 2:57
NEI: 2:57

Nearly identical to the new/new

Author:  hvz [ Tue Sep 30, 2014 9:11 pm ]
Post subject:  Re: Stereo Tool 7.51 BETA

Quote:
Since tests take some time - which CPUs do you prefer to test with first? i5 3210m, i3-3220T, E2220, i3-370m, Athlon II x4 620?

EDIT: Same 6:11 track and conditions as yesterday, King's Fire preset

i5-3210m

FAS: 1:33
NEI: 1:34

These are identical to the new/new

Athlon II x4 620

FAS: 2:57
NEI: 2:57

Nearly identical to the new/new
Thanks! So roughly the same, maybe slightly better.

I would be interested in the result on the E2220 system. But so far, these numbers are good enough to pick one of these 3 for the next step. So I'll post another version in a few minutes, which I *hope* will behave the same on the E2220 and Athlon but should be faster on the i3.

Author:  hvz [ Tue Sep 30, 2014 9:29 pm ]
Post subject:  Re: Stereo Tool 7.51 BETA

New test version:
http://www.stereotool.com/download/dsp_ ... as-dua.dll

I expect this version to be faster on the i3. It should be identical (at least I hope it is) on the other systems.

I think I've found out what's wrong with my pc, I hope to be able to confirm my suspicion tomorrow.

Author:  evrenselFM [ Tue Sep 30, 2014 10:14 pm ]
Post subject:  Re: Stereo Tool 7.51 BETA

Hello, where is the standalone version of Stereo Tool? We are waiting for it; please upload it.

Author:  DJ-DOGGY [ Wed Oct 01, 2014 12:10 am ]
Post subject:  Re: Stereo Tool 7.51 BETA

Looking at the red CPU load bar in ST, I think NEW/NEW shows a slightly lower load than the rest.
The others look identical.

Author:  Slawomir B. [ Wed Oct 01, 2014 12:16 am ]
Post subject:  Re: Stereo Tool 7.51 BETA

Here come the results for E2220 (same conditions, same 6:11 track as last time, King's Fire preset):

FAS: 3:18
NEI: 3:19
FAS DUA: 3:18

(almost identical to new/new)

i5 3210m (2c/4t):

FAS DUA: 1:35 (checked twice, around 1:35/1:34)

unfortunately not faster in comparison with:
FAS: 1:33
NEI: 1:34

Athlon II x4 620:

FAS DUA: 2:58

in comparison with
FAS: 2:57
NEI: 2:57

I measured manually so +/- one second, therefore results should be considered very similar if not identical.
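
To put the stopwatch timings above in perspective, the sketch below converts each one into a faster-than-real-time factor for the 6:11 track and checks whether two timings differ by more than the stated measurement error. The function names are made up for illustration, and treating two hand-timed results as indistinguishable when they differ by at most 2 s (1 s of error on each) is an assumption, not something stated in the thread.

```python
TRACK = "6:11"  # length of the test track used in the thread

# Processing times reported in the thread (minutes:seconds).
RESULTS = {
    "E2220": {"FAS": "3:18", "NEI": "3:19", "FAS DUA": "3:18"},
    "i5-3210m": {"FAS": "1:33", "NEI": "1:34", "FAS DUA": "1:35"},
    "Athlon II x4 620": {"FAS": "2:57", "NEI": "2:57", "FAS DUA": "2:58"},
}

def to_seconds(mmss):
    """Convert a 'm:ss' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

def realtime_factor(track, elapsed):
    """How many times faster than real time the track was processed."""
    return to_seconds(track) / to_seconds(elapsed)

def indistinguishable(a, b, error_s=1):
    """Assumed rule: each timing is off by up to error_s seconds."""
    return abs(to_seconds(a) - to_seconds(b)) <= 2 * error_s

for cpu, builds in RESULTS.items():
    for build, elapsed in builds.items():
        factor = realtime_factor(TRACK, elapsed)
        print(f"{cpu:18s} {build:8s} {elapsed}  {factor:.2f}x real time")
```

Under that rule, the three builds are indistinguishable from each other on every CPU tested here; the only clear differences are between CPUs (roughly 4x real time on the i5-3210m against just under 2x on the E2220).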

Author:  hvz [ Wed Oct 01, 2014 2:37 am ]
Post subject:  Re: Stereo Tool 7.51 BETA

Ow. I just looked up the i5 3210 and it doesn't support AVX2 yet! Whoops... Tomorrow I'll test whether there's a difference in performance between the AVX and AVX2 versions (it could very well be that they are identical, and then I can just build an AVX version).

What I changed in the fas_dua version is that it contains 2 code paths, one for SSE2 and one for AVX2. This used to be slower in the past (10.1 compiler), which is why I've had a separate download for non-SSE2 systems. And... in your test it again seems to be (very marginally) slower.

The problem right now is that my test system is unreliable, but I need to figure out if the AVX2 version is faster than AVX. :( and if there are other paths that might also be faster on E2220's etc. (SSE3, SSE4.1, ...). I could basically generate all those paths, but I don't know if the overhead of selecting the proper path will be more than the gain - and I need a reliable Haswell test system to test that. So... this may take a few more days.
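
The trade-off described here, several code paths plus a cost for choosing between them, is commonly handled by selecting the path once at startup rather than on every call. Below is a minimal sketch of that idea in Python; it is not how Stereo Tool or its compiler implements dispatch. The path names, the `PREFERENCE` order and the `process_*` functions are hypothetical placeholders, and the feature set is passed in by the caller instead of being read from the CPU.

```python
# Hypothetical code paths. In a real DSP these would be separately compiled
# versions of the same routine; here they all compute the same result.
def process_sse2(samples):
    return [s * 0.5 for s in samples]

def process_avx(samples):
    return [s * 0.5 for s in samples]

def process_avx2(samples):
    return [s * 0.5 for s in samples]

# Most capable path first (assumed order of preference).
PREFERENCE = [
    ("avx2", process_avx2),
    ("avx", process_avx),
    ("sse2", process_sse2),
]

def select_path(cpu_features):
    """Return (name, function) for the best path the CPU supports."""
    for name, func in PREFERENCE:
        if name in cpu_features:
            return name, func
    raise RuntimeError("no supported code path")

# Choose once at startup (feature set of a CPU with AVX but no AVX2)...
name, process = select_path({"sse2", "avx"})
# ...then every later call goes straight to the chosen function.
print(name, process([1.0, 2.0]))
```

With this structure the selection cost is paid once, so what remains to be measured is whether each extra path is actually faster on the CPUs that would use it.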


@evrenselFM: I first need to know which settings generate the fastest code, then I'll set up and upload all the different versions. But there's no difference in behavior so for now you can just use the last beta that I uploaded.

Author:  Slawomir B. [ Wed Oct 01, 2014 10:26 am ]
Post subject:  Re: Stereo Tool 7.51 BETA

Quote:

Problem right now is that my test system is unreliable, but I need to figure out if the AVX2 version is faster than AVX. :( and if there are other paths that might also be faster on E2220's etc. (SSE3, SSE4.1, ...).
For the SSE tests, Wolfdale-based Core 2 Duos (e.g. the E8x00 series) or Penryn-based Core 2 Quads (e.g. Q9x50) would actually be better, as they support SSE 4.1. The Q9550, for instance, is essentially two E8300-class dual-core dies in one package. These processors are the last of the Core 2 generation, released just before Nehalem, the first-generation i-architecture (SSE 4.2 was only introduced with Nehalem). As far as I remember, Bojcha has an E8400. A mobile T8300 (or similar) would be good as well.

The E2220 only goes up to Supplemental SSE3 (SSSE3), like the E6400, the Q6600 and all CPUs based on the Conroe dual core or Kentsfield quad core.

With the first-gen i-series, such as the i7 920, on the other hand, the full set of SSE instructions (including SSE 4.2) can be checked without AVX. The same goes for the mobile i3-370m, which I can test if my girlfriend allows me to ;)

So to put it straight, as far as Intel goes:

1st gen Core 2 Conroe/Kentsfield (and their mobile versions) - SSE 1, SSE 2, SSE 3, SSSE 3
2nd gen Core 2 Wolfdale/Penryn (and their mobile versions) - SSE 1, SSE 2, SSE 3, SSSE 3, SSE 4.1
1st gen i-series - SSE 1, SSE 2, SSE 3, SSSE 3, SSE 4.1, SSE 4.2
2nd/3rd gen i-series - SSE 1, SSE 2, SSE 3, SSSE 3, SSE 4.1, SSE 4.2, AVX
4th gen i-series - SSE 1, SSE 2, SSE 3, SSSE 3, SSE 4.1, SSE 4.2, AVX, AVX2, FMA3

On the AMD side, the latest CPUs and APUs support SSE 4.1, SSE 4.2 and AVX as well as FMA3 and FMA4. They made quite a leap recently in that respect. Unfortunately I sold my FX 8320 and don't have any new APU either.

So now we know which CPUs to test with to compare the impact of the instruction set on ST performance.
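
As a rough illustration of how the Intel list above translates into a test plan, the sketch below stores the list as data and works out which instruction sets each generation adds over the previous one, so that one CPU per generation covers every step. The data is copied from the list in this post; the structure and function names are only illustrative assumptions.

```python
BASE = ["SSE 1", "SSE 2", "SSE 3", "SSSE 3"]

# Intel generations and their instruction sets, as listed in this post.
GENERATIONS = [
    ("1st gen Core 2 (Conroe/Kentsfield)", BASE),
    ("2nd gen Core 2 (Wolfdale/Penryn)", BASE + ["SSE 4.1"]),
    ("1st gen i-series", BASE + ["SSE 4.1", "SSE 4.2"]),
    ("2nd/3rd gen i-series", BASE + ["SSE 4.1", "SSE 4.2", "AVX"]),
    ("4th gen i-series", BASE + ["SSE 4.1", "SSE 4.2", "AVX", "AVX2", "FMA3"]),
]

def first_generation_with(instruction_set):
    """Oldest listed generation that supports the instruction set."""
    for name, sets in GENERATIONS:
        if instruction_set in sets:
            return name
    return None

def newly_added():
    """Map each generation to the sets it adds over the older ones."""
    seen = set()
    added = {}
    for name, sets in GENERATIONS:
        added[name] = [s for s in sets if s not in seen]
        seen.update(sets)
    return added

for name, sets in newly_added().items():
    print(f"{name}: adds {', '.join(sets)}")
```

Testing one CPU from each row then isolates the effect of each newly added instruction set, provided the build under test actually uses it.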

Personally, I would opt for SSE and AVX optimizations rather than AVX2, since they are more widely available and supported, especially if AVX2 takes a lot of effort to achieve.

All times are UTC+02:00