Stereo Tool https://forums.stereotool.com/ |
|
Stereo Tool 7.03 BETA https://forums.stereotool.com/viewtopic.php?t=4448 |
Page 76 of 102 |
Author: | radiofreak [ Mon Mar 11, 2013 10:05 pm ] |
Post subject: | Re: Stereo Tool 7.03 BETA |
Quote: I don't think so, I have many, many schematics where hardware RDS and stereo decoders are completely independent; maybe in Germany it was something different... I don't know... In Poland it doesn't matter whether it's 90 or 0: some stations use 0, some 90. The specification allows an RDS subcarrier phase of 0/180 or +/-90 relative to the 3rd harmonic of the pilot tone; +/-90 is better.
AFAIK there is no (longer a) rule whether 90° or 0° here in Germany. I've analysed some stations with the Pira around here. Some are 90°, most are 0° and one was even in between (should I contact them?). When we license our station year in and year out, the regulation authority doesn't care what we do in and with our MPX, as long as the max. deviation is <=75 kHz and the MPX power is <=0 dBr, except if we want to use RDS-TA. Then the local police office has to be informed... |
Author: | Brian [ Tue Mar 12, 2013 1:00 am ] |
Post subject: | Re: Stereo Tool 7.03 BETA |
Quote: @Brian: I have checked all the vectorization reports of performance sensitive functions, and for nearly all of them I either made them vectorize or I understand why they aren't.
So, you now have the explanation for this and understand why it happens???? http://software.intel.com/en-us/forums/topic/346811 Quote:
I'm trying to squeeze the last bit of optimization out of a program using Intel C++ 10.1 (because with later versions I'm getting slower code - I'll look into that later).
When looking at the vectorization reports, I noticed 2 things I hadn't expected, and I wonder if they can be solved (without rewriting lots of code - total code base is over 2 MB and I'm working on it alone). I've tried to google them but didn't find any useful answers.
This one seems to be the most important:
    fft_abs_sse2[2*cc] = max(fft_abs_sse2[2*cc], strength * m);
    .\Clip1Ch.cpp(1999): (col. 13) remark: vector dependence: proven ANTI dependence between fft_abs_sse2 line 1999, and fft_abs_sse2 line 1999.
    .\Clip1Ch.cpp(1999): (col. 13) remark: vector dependence: proven ANTI dependence between fft_abs_sse2 line 1999, and fft_abs_sse2 line 1999.
    .\Clip1Ch.cpp(1999): (col. 13) remark: vector dependence: proven FLOW dependence between fft_abs_sse2 line 1999, and fft_abs_sse2 line 1999.
    .\Clip1Ch.cpp(1999): (col. 13) remark: vector dependence: proven FLOW dependence between fft_abs_sse2 line 1999, and fft_abs_sse2 line 1999.
    .\Clip1Ch.cpp(1999): (col. 13) remark: vector dependence: proven ANTI dependence between fft_abs_sse2 line 1999, and fft_abs_sse2 line 1999.
    ...
While I know that there's an _mm_max_ SIMD instruction. Problem might be the definition of max, I'm using:
    #define max(a,b) (((a)>(b)) ? (a) : (b))
The compiler might see this as an if instruction if it's unable to optimize everything out. Is there a better definition for max that doesn't cause the compiler to see dependencies where there are none?
Another situation that occurs very frequently in my code is this:
    for (int c=0; c<f1; c++)
    {
        temp[2*c] *= one_DIV_bass_static_clip_level_dynamic;
        temp[2*c+1] *= one_DIV_bass_static_clip_level_dynamic;
    }
Clearly, there are no dependencies between temp[2*c] and temp[2*c+1], but the compiler thinks otherwise:
    .\Clip1Ch.cpp(797): (col. 9) remark: loop was not vectorized: existence of vector dependence.
    .\Clip1Ch.cpp(800): (col. 13) remark: vector dependence: proven FLOW dependence between temp line 800, and temp line 799.
    .\Clip1Ch.cpp(800): (col. 13) remark: vector dependence: proven ANTI dependence between temp line 800, and temp line 799.
    .\Clip1Ch.cpp(800): (col. 13) remark: vector dependence: proven OUTPUT dependence between temp line 800, and temp line 799.
I think if these two situations are solved at least 50% of the loops that currently don't get vectorized will be. Your help is greatly appreciated.
Quote:
The changes I made today should give a reduction of about 4% in the total CPU load (on the most active CPU core; reduction should be bigger on a single core system).
BETA052 resulted in 3-5% decrease. So, no, not any bigger than what you said for that one. You should also note that you said 4% via your writing to a file method of testing, but my 3-5% (average of 4) was from simply looking at Task Manager. IOW, it's not always inaccurate... Looking at things through ProcExp, I'm not seeing any DPC activity either. Never have. There's some interrupt stuff every now and then, but it's minimal. I still believe that your code on K8 is cache (aka core) clock dependent for most things, with benefit to the Opteron and X2 line with whatever supports multicore processing. That's why I asked about Scalar vs. Packed. It's also why I've mentioned the MOVNTPS instruction. MOVNTPS helps minimize cache pollution, which would be beneficial to all systems, if you can use it, which you may or may not be able to. |
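For reference, the two loop patterns from the quoted Intel forum post can be written in a form that is often easier for an auto-vectorizer to analyze. This is only a sketch and not necessarily how the Stereo Tool code was changed: `scale_interleaved` and `update_peaks` are hypothetical names, `m` is assumed here to be a per-bin array (the post does not say), and whether a particular compiler version vectorizes either loop is not guaranteed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Scale interleaved pairs in place. This touches the same elements as the
// quoted loop over temp[2*c] and temp[2*c+1], but as a single unit-stride
// loop over 2*pairs floats, so there is only one store per iteration.
inline void scale_interleaved(float* temp, std::size_t pairs, float factor) {
    assert(temp != nullptr || pairs == 0);
    const std::size_t n = 2 * pairs;
    for (std::size_t i = 0; i < n; ++i) {
        temp[i] *= factor;
    }
}

// Raise every even-indexed element to at least strength * m[cc].
// std::max replaces the ternary macro from the post; the candidate value is
// computed into a local first, so the macro's double evaluation is avoided.
// ASSUMPTION: m is an array with one value per bin.
inline void update_peaks(float* fft_abs, const float* m,
                         std::size_t count, float strength) {
    assert((fft_abs != nullptr && m != nullptr) || count == 0);
    for (std::size_t cc = 0; cc < count; ++cc) {
        const float candidate = strength * m[cc];
        fft_abs[2 * cc] = std::max(fft_abs[2 * cc], candidate);
    }
}
```

The first rewrite removes the two-stores-per-iteration pattern that the report complains about. The second still writes with stride 2, which may limit what the compiler can do even once the macro is gone.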
Author: | hvz [ Tue Mar 12, 2013 1:20 am ] |
Post subject: | Re: Stereo Tool 7.03 BETA |
Quote: So, you now have the explanation for this and understand why it happens????
No, I worked around it. Quote:
It's also why I've mentioned the MOVNTPS instruction. MOVNTPS helps minimize cache pollution, which would be beneficial to all systems, if you can use it, which you may or may not be able to.
I almost never write data to memory that I don't need very quickly afterwards. So I don't expect much from this (I have used it at work in the past, so yes, I know what it does. But it's mainly useful if you're working on large amounts of data, not for short blocks of audio data.) Edit: Just to make sure: This 3-5% is what you get when you divide the old CPU load by the new one? Absolute numbers don't mean much, I'm of course talking about relative changes. Task Manager is often accurate, but sometimes it's not at all, and I really don't know when to trust it and when not. |
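As a sketch of what the MOVNTPS suggestion looks like in code: the SSE intrinsic `_mm_stream_ps` emits a non-temporal store, and `_mm_sfence` orders those stores before later ones. `stream_copy` is a hypothetical helper, not code from Stereo Tool. It assumes an x86 target with SSE and a 16-byte-aligned destination.

```cpp
#include <xmmintrin.h>  // SSE: _mm_stream_ps, _mm_loadu_ps, _mm_sfence
#include <cassert>
#include <cstddef>
#include <cstdint>

// Copy n floats from src to dst using non-temporal (cache-bypassing) stores.
// ASSUMPTION: dst is 16-byte aligned, as _mm_stream_ps requires.
inline void stream_copy(float* dst, const float* src, std::size_t n) {
    assert(reinterpret_cast<std::uintptr_t>(dst) % 16 == 0);
    std::size_t i = 0;
    // Main loop: four floats per MOVNTPS store.
    for (; i + 4 <= n; i += 4) {
        _mm_stream_ps(dst + i, _mm_loadu_ps(src + i));
    }
    // Make the streaming stores globally visible before any later stores.
    _mm_sfence();
    // Scalar tail for the remaining 0..3 elements.
    for (; i < n; ++i) {
        dst[i] = src[i];
    }
}
```

This fits the objection raised in the reply: bypassing the cache only pays off when the written data is not read again soon, which is rarely the case for short audio blocks.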
Author: | Brian [ Tue Mar 12, 2013 2:11 am ] |
Post subject: | Re: Stereo Tool 7.03 BETA |
Quote: Quote: So, you now have the explanation for this and understand why it happens????
No, I worked around it.
Quote:
I almost never write data to memory that I don't need very quickly afterwards. So I don't expect much from this (I have used it at work in the past, so yes, I know what it does. But it's mainly useful if you're working on large amounts of data, not for short blocks of audio data.)
I tended to doubt it could be used, but worth a shot. Semi-related to that is SFENCE, but again, not sure if it would help. Quote: Edit: Just to make sure: This 3-5% is what you get when you divide the old CPU load by the new one? Absolute numbers don't mean much, I'm of course talking about relative changes.
No, I did give you absolute. Relative would be 5-8%. Still not a huge delta, given that I'm already pushing up near 80 if quality is set to 100. Quote: Task Manager is often accurate, but sometimes it's not at all, and I really don't know when to trust it and when not.
I see your point, which is why I pointed you in the direction of Process Explorer and Process Monitor.
|
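The difference between the absolute and relative figures discussed above comes down to one division. A minimal sketch with made-up numbers (80% and 76% are illustrative values, not measurements from this thread); `absolute_reduction` and `relative_reduction` are hypothetical helpers:

```cpp
#include <cassert>

// Absolute change in CPU load, in percentage points.
inline double absolute_reduction(double old_load, double new_load) {
    return old_load - new_load;
}

// Relative change: the absolute change as a fraction of the old load.
inline double relative_reduction(double old_load, double new_load) {
    assert(old_load > 0.0);
    return (old_load - new_load) / old_load;
}
```

With these example values, a drop from 80% to 76% load is 4 percentage points in absolute terms and 5% in relative terms. The lower the starting load, the larger the relative figure for the same absolute drop.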
Author: | hvz [ Tue Mar 12, 2013 2:28 am ] |
Post subject: | Re: Stereo Tool 7.03 BETA |
Next version will be about 7-10% faster than 052, measured again with PhantomFM's 80s preset. (I'm getting some jitter in my measurements, and it's too late to repeat it a lot of times). Those are again relative numbers. Difference on single core systems is probably again bigger. By the way, 45% of the CPU cycles are now spent in an external Intel library that I have no influence on (except calling it less often of course), so improving things is getting increasingly harder. Improvements in this new version are in the compressors and in the advanced clipper. |
Author: | gpagliaroli [ Tue Mar 12, 2013 3:15 am ] |
Post subject: | Re: Stereo Tool 7.03 BETA |
All optimizations are welcome, and in beta 52 I notice some of them; my system will thank you. I'd like to take this opportunity to comment on the activation of the Side Chain Compressor in both the AGC and the SingleBand. I find the activation confusing: first there is "Use Side Chain" and then "PEQ Sidechain". I think that activating "Use Side Chain" should also activate "PEQ Sidechain", or that this last checkbox should be removed altogether, because the sidechain is meaningless without the EQ. Another control with a display problem is the MB "Drive", where the maximum is 42 dB but the halfway position shows 36 dB. Is there a problem with the scaling or with the dB calculation? Something similar happens with "Output level". |
Author: | Brian [ Tue Mar 12, 2013 7:27 am ] |
Post subject: | Re: Stereo Tool 7.03 BETA |
Quote:
By the way, 45% of the CPU cycles are now spent in an external Intel library that I have no influence on (except calling it less often of course), so improving things is getting increasingly harder.
Might be time to revisit the 13 vs. 10.1 compiler situation. I mean for future enhancements, not for this release. Then with that, I also don't mean 5-10 releases down the road, but stopping after this release and trying to figure out that issue.
|
Author: | hvz [ Tue Mar 12, 2013 9:22 am ] |
Post subject: | Re: Stereo Tool 7.03 BETA |
Another performance improvement (probably about 9% compared to previous beta!) Stand alone: http://www.stereotool.com/download/ster ... 04-053.exe Winamp DSP: http://www.stereotool.com/download/dsp_ ... 04-053.exe VST: http://www.stereotool.com/download/vst_ ... 04-053.dll |
Author: | hvz [ Tue Mar 12, 2013 9:27 am ] |
Post subject: | Re: Stereo Tool 7.03 BETA |
Quote: Quote:
By the way, 45% of the CPU cycles are now spent in an external Intel library that I have no influence on (except calling it less often of course), so improving things is getting increasingly harder.
Might be time to revisit the 13 vs. 10.1 compiler situation. I mean for future enhancements, not for this release. Then with that, I also don't mean 5-10 releases down the road, but stopping after this release and trying to figure out that issue. |
Author: | phantomfm [ Tue Mar 12, 2013 10:06 am ] |
Post subject: | Re: Stereo Tool 7.03 BETA |
Quote: Another performance improvement (probably about 9% compared to previous beta!) Stand alone: http://www.stereotool.com/download/ster ... 04-053.exe Winamp DSP: http://www.stereotool.com/download/dsp_ ... 04-053.exe VST: http://www.stereotool.com/download/vst_ ... 04-053.dll
For me, a 4% improvement, so total CPU load is now 29%! Well done! |
All times are UTC+02:00 |