| Stereo Tool https://forums.stereotool.com/ |
|
| Stereo Tool 7.03 BETA https://forums.stereotool.com/viewtopic.php?t=4448 |
Page 22 of 102 |
| Author: | hvz [ Thu Jan 31, 2013 1:10 am ] |
| Post subject: | Re: Stereo Tool 7.03 BETA |
BETA024:
Stand alone: http://www.stereotool.com/download/ster ... 04-024.exe
Winamp DSP: http://www.stereotool.com/download/dsp_ ... 04-024.exe
VST: http://www.stereotool.com/download/vst_ ... 04-024.dll
- Optimized the compressor code slightly
- Also optimized 'overall' code a bit (stuff that gets executed even if all filters are disabled)
- Changed delayed bass clipping protection a bit again, behavior should be better (make more sense); the Strength slider is replaced by Relative sensitivity, which matches the other settings better. Behavior is changed to match the new name. Default value is 50%; for the Big O preset, which was specifically made for this new feature, I've set it at 25%. |
|
| Author: | Brian [ Thu Jan 31, 2013 3:05 am ] |
| Post subject: | Re: Stereo Tool 7.03 BETA |
Quote: BETA024:
No improvement for me. No surprise though.
|
|
| Author: | hvz [ Thu Jan 31, 2013 4:52 pm ] |
| Post subject: | Re: Stereo Tool 7.03 BETA |
Wow. I just discovered something crazy. I've always assumed that the compiler would replace things like log(10.0f) or sqrt(.5f) by a number - now it turns out that in many cases it does not! Ok then...
#define SQRT_0_5 .70710678118654752440084436210485f
#define SQRT_0_35 .59160797830996160425673282915616f
#define DIV_SQRT_0_5 1.4142135623730950488016887242097f
#define SQRT_2 1.4142135623730950488016887242097f
#define SQRT_SQRT_2 1.1892071150027210667174999705605f
#define LOG_10 3.3219280948873623478703194294894f
#define LOG_2 1.0f
#define LOG_2d 1.0 |
|
| Author: | hvz [ Thu Jan 31, 2013 5:20 pm ] |
| Post subject: | Re: Stereo Tool 7.03 BETA |
Quote: Quote: BETA024:
No improvement for me. No surprise though.
For the compressor, I can optimize the code for certain situations (several sliders that might be useful make the code more complex; if I can assume them to be at 0 or at 1, I can remove several lines of code that are executed for each sample; will do that). But the pow() calls really use the most processing power. I'll check if it's an option to use a lookup table with interpolation (using existing optimized pow code isn't really an option, because for audio other things are important than for general pow() calculations). The log() will have to come back for that to work - but that seems to be FAR heavier than pow(). Edit: No, I can put the log() in the table as well.. Wow. CPU load will probably drop by half if I do all these things! |
|
| Author: | wiele [ Thu Jan 31, 2013 9:58 pm ] |
| Post subject: | Re: Stereo Tool 7.03 BETA |
Sounds super Hans! btw, testing the new singleband right now, sounds good! Can't wait for the multiband version |
|
| Author: | Brian [ Thu Jan 31, 2013 10:32 pm ] |
| Post subject: | Re: Stereo Tool 7.03 BETA |
Quote: Quote: Quote: BETA024:
No improvement for me. No surprise though.Quote: I inlined some code, maybe that wasn't a good idea - turned it off again now. Except for that I only removed code, so I really expect the next beta to be (very! slightly) faster.
I'll need to go back and get in the zone I was in while reading stuff last night, but I think inlining some things is suggested for AMD. Page 155 of the following PDF lists potential bottlenecks for K8 and K10. http://www.agner.org/optimize/microarchitecture.pdf In that PDF is discussion about processors from the original Pentium (P1), clear on up to i5 and i7 (Sandy Bridge) on the Intel side, Athlon (K7) through Bulldozer and Bobcat (post-K10, called family 15h) on the AMD side, and also the VIA Nano 2000 and 3000 cores. Quote:
For the compressor, I can optimize the code for certain situations (several sliders that might be useful make the code more complex, if I can assume them to be at 0 or at 1 I can remove several lines of code that are executed for each sample; will do that).
The bottleneck for my system does not reside in the code you're currently working on. I know what I'm asking for there, but as I've said, I have no desire to do anything other than figure out why my processor is choking so badly. As for the pow() and log stuff, I would guess you're talking about the last link I gave you about the adjustable accuracy code? |
|
| Author: | Bojcha [ Fri Feb 01, 2013 2:38 am ] |
| Post subject: | Re: Stereo Tool 7.03 BETA |
Quote: On which processor?
I have an Intel Core 2 Duo E8400 OCed @ 4.5GHz.
Anyway.. I was testing DSP beta23 and 24 w/ all filters off (Resource Monitor in Windows 7 Professional)..
- in bypass, Winamp reports a constant average CPU of 0.00%.
- w/o bypass and all filters off.. ~0.65%
- beta24 - w/o bypass and all filters off.. ~0.85% |
|
| Author: | Brian [ Fri Feb 01, 2013 7:54 am ] |
| Post subject: | Re: Stereo Tool 7.03 BETA |
Quote: Quote: On which processor?
I have an Intel Core 2 Duo E8400 OCed @ 4.5GHz.
Anyway.. I was testing DSP beta23 and 24 w/ all filters off (Resource Monitor in Windows 7 Professional)..
- in bypass, Winamp reports a constant average CPU of 0.00%.
- w/o bypass and all filters off.. ~0.65%
- beta24 - w/o bypass and all filters off.. ~0.85%
Second, it's entirely possible that what changed was that code was no longer properly aligned on memory boundaries. That can't be determined by Resource Monitor. It would need to be inspected with a profiling utility. In my own profiling, that counter (misaligned requests or something like that) was non-zero, but I want to say it was single-digit, so I didn't consider it to be significant. I'll retest later today to make sure I'm remembering correctly.
Lastly, on my own system, if I use Process Explorer (Sysinternals, part of Microsoft), when Process Explorer is running, Winamp varies between 0 and 2. When Process Explorer is not running, and I'm just looking at Task Manager, Winamp stays at 0. Given this, what you are seeing could be just monitoring overhead. |
|
| Author: | hvz [ Fri Feb 01, 2013 5:05 pm ] |
| Post subject: | Re: Stereo Tool 7.03 BETA |
Ok, a few answers. First, sorry for the long delay - my internet connection was gone yesterday evening and I had to be somewhere this morning.
1. Whether or not the pow() is "the" most important problem on your CPU doesn't really matter, removing it will help a lot regardless of that. Apparently (numbers I found online) a pow() takes about 120-150 clock cycles! So replacing it by 2 memory loads and a few float/int conversions should help a lot - and it does: On my system the total CPU load, measured by running ST in Winamp with FLAC input and MP3 output, decreased by about 20%. That may not seem like a lot, but the other things also use some processing power, so the compressor CPU load is reduced by more than 20%. If I use multiple instances of it in the new multiband compressor, this will really make a noticeable difference to the total CPU load as well.
2. Inlining is a bad idea if the processor cache is smaller than what the software uses, because it causes the code to become bigger. That's most likely the cause of the performance change.
3. @Brian: Sorry, I really don't like the idea of sending debug files to anyone... Apart from that, it probably wouldn't be too useful, because even with those files the profiling software that I use is unable to figure out from which line in the code - and even from which function or class - the CPU usage comes. This is caused by the very aggressive optimizations done by the compiler, which attempts to combine .obj files and performs inline-like actions on object files. So basically, the only way for me to figure out which line of code is responsible for using a lot of processing power is by generating .asm files and then - and even that isn't always easy - figuring out from those ASM files which line of code each assembly instruction corresponds to. |
|
| Author: | hvz [ Fri Feb 01, 2013 5:10 pm ] |
| Post subject: | Re: Stereo Tool 7.03 BETA |
BETA025: Optimized compressor code.
Stand alone: http://www.stereotool.com/download/ster ... 04-025.exe
Winamp DSP: http://www.stereotool.com/download/dsp_ ... 04-025.exe
VST: http://www.stereotool.com/download/vst_ ... 04-025.dll
Edit: O wow. Performance numbers for my measurement, FLAC -> ST -> MP3, amount of audio processed in 15 seconds:
- Old version with compressor enabled: 3:00 --> CPU load 8.333%
- New version with compressor enabled: 3:30 --> CPU load 7.143%
- Compressor disabled: 5:30 --> CPU load 4.545%
Ignoring the offset (4.545%) that means the CPU loads are 3.788% and 2.597%. So indeed a drop of over 30%!
New compressor peak mode: 4:15 --> CPU load 5.882% (1.337%), which means that the Smart (RMS or peak) detection and the compression itself each take about 1.3% on my system. I know this is still a lot more than what the old compressor uses (I seem to see a CPU usage of 0.2%, but that could very well just be measurement noise). |
|
| All times are UTC+02:00 |
|