All times are UTC+02:00




PostPosted: Fri Feb 01, 2013 5:49 pm 

Joined: Sun Dec 12, 2010 2:26 pm
Posts: 885
Quote:
3. @Brian: Sorry, I really don't like the idea of sending debug files to anyone...
I know, Hans. I realize what I'm asking. The difficulty here is that, because you don't have the same platform, you don't have a proper test bed. Changes you make that work fine on your system could be beneficial on mine, or they could be detrimental, or vice versa.

For example, take your thought about inlining:

A Core2Duo E8400 has 32Kx2 of L1 (Instruction + Data). My Athlon 64 3700+ has 64Kx2 of L1 (Instruction + Data). Since all you're doing with all the other filters disabled is exercising the very basic core and the L1, there is indeed a reason why I saw no change vs. Bojcha seeing a degradation - my system has more L1 to use.

Additionally, I've found out that the previous discussion of 128-bit SSE is very relevant. On my CPU, the 4x32-bit ints that you're sending to that SSE shuffle function *DO* get split into 2 groups, with one set of 2 going to the upper half of the XMM register and the other going to the lower half. On K10 and newer, as well as Core (all versions), a 128-bit SSE instruction is handled as a single 128-bit instruction, not halved.
Quote:
Except for that, it probably wouldn't be too useful, because even with those files the profiling software that I use is unable to figure out from which line in the code - and even from which function or class - the CPU usage comes.
What's extremely frustrating to me is the knocking of the effort without even attempting it. That's the same situation as I encountered for 2 years at my former job. The idea ultimately worked, and has been tremendously beneficial, but for 2 years my management made the decision that they "knew best" because they had a "big picture" view of things to be happening "in the future", which I did not.

During those 2 years, I spent an average of 12-14 hours a day at the job, instead of 8 a day. Quite a few of those days were up to 16 hours (7AM to 11PM). Rarely, I worked from 6AM to 2AM, for 2 or 3 days at a time. This was due to support, not development issues. The management, naturally, never pulled those types of hours. I was on salary, not hourly, and the expectation was for me to do exactly what I did, which was to stay as long as it took to get things done. If I had decided to bail out and say I was done for the day, that would've caused me to be fired.

So, here's what I'll do, Hans. I'll run a profile and then I'll provide you with a couple of hex addresses. You can tell me if this relates to your code or not.


Last edited by Brian on Fri Feb 01, 2013 6:50 pm, edited 1 time in total.

PostPosted: Fri Feb 01, 2013 6:30 pm 

Joined: Fri Oct 08, 2010 3:58 am
Posts: 304
No VST again :x

... sometimes I get the feeling that I am the only one paranoid to use the VST plugin.

Anyway, coming to the point: with the latest beta, the DSP version seems to load incorrect values for parameters that specify peak values in either % or dB.

_________________
visit website


PostPosted: Fri Feb 01, 2013 6:32 pm 

Joined: Sun Dec 12, 2010 2:26 pm
Posts: 885
FYI, beta 25 resulted in no improvement for me. I don't think you understand that when I say that, I mean I switched *OFF* the old compressor and switched *ON* the new one, so I am indeed tracking the changes you're making. It's just that they don't improve the situation for me *AT ALL*, because the problem is elsewhere!

Profiling info:

CS:EIP values

0x205e468
0x205e475
0x205e7fb
0x205e801
0x205e47b

The actual largest is 0x208bffc, but that has a Symbol + Offset of "winampDSPGetHeader2". The rest are "NO SYMBOL".

Drilling down into GetHeader2, I've got the largest amount of time being spent there at CS:EIP 0x218742c, which is winampDSPGetHeader2+1029168, in the format of Symbol+Offset.

I can't drill down on the other stuff, which doesn't have symbolic info, which is likely your stuff.

From my understanding, the Symbol+Offset value should at least be able to point you in a general ballpark.
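
As a rough sketch of the arithmetic behind Symbol+Offset (Python, purely illustrative): the profiler reports an absolute address, and subtracting the address of the nearest preceding symbol gives the offset. The two addresses are the ones quoted above; the helper name `symbol_offset` is made up for this example.

```python
def symbol_offset(address: int, symbol_address: int) -> int:
    """Return how far `address` lies past the start of a symbol."""
    if address < symbol_address:
        raise ValueError("address lies before the symbol")
    return address - symbol_address


# Values taken from the profile above.
GETHEADER2 = 0x208BFFC  # address the profiler labels winampDSPGetHeader2
HOTSPOT = 0x218742C     # largest hotspot when drilling down

offset = symbol_offset(HOTSPOT, GETHEADER2)
print(f"winampDSPGetHeader2+{offset}")  # winampDSPGetHeader2+1029168
```

An offset of roughly 1 MB past an exported symbol suggests the sample probably sits in some later, non-exported function, and the profiler simply attaches the nearest exported name it knows about.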


Last edited by Brian on Fri Feb 01, 2013 7:34 pm, edited 1 time in total.

PostPosted: Fri Feb 01, 2013 6:38 pm 

Joined: Sun Dec 12, 2010 2:26 pm
Posts: 885
http://www.codeproject.com/Articles/459 ... ne-from-of

Introduction

Suppose your customer has reported an error to you along with an offset number; from that offset you can determine the source line that caused the error. This article explains how to find the erroneous source line using the offset address in a release exe.

The advantage of this method is that it requires neither sending extra programs or a debug exe to your customer nor rebuilding your program, as explained in other submitted articles. The disadvantage is that you have to spend some extra effort digging into two extra compiler-generated file types: a .map file and some .cod files.

A *.map file basically includes the base addresses of compiled functions. A *.cod file generally includes assembly, machine code and source code, if you apply the settings below.
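
To make the .map idea concrete, here is a minimal sketch in Python of the lookup step only: given function start offsets, find the function that contains a reported offset. The entries and function names below are invented for illustration and are not taken from any real .map file; a real lookup would also have to account for the module's load address and the segment layout.

```python
import bisect

# Hypothetical (start_offset, name) pairs, as one might extract from a .map file.
MAP_ENTRIES = [
    (0x1000, "init_filters"),
    (0x1A40, "run_compressor"),
    (0x2F10, "apply_limiter"),
]


def find_function(offset, entries=MAP_ENTRIES):
    """Return (name, offset_into_function) for the entry containing `offset`."""
    entries = sorted(entries)
    starts = [start for start, _ in entries]
    # Index of the last function whose start is <= offset.
    i = bisect.bisect_right(starts, offset) - 1
    if i < 0:
        return None  # offset lies before the first known function
    start, name = entries[i]
    return name, offset - start


print(find_function(0x1A55))
```

The offset into the function is what you would then look up in the matching .cod listing to get to a source line.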

[snip out rest - refer to actual page for remainder of discussion]


Last edited by Brian on Fri Feb 01, 2013 7:32 pm, edited 1 time in total.

PostPosted: Fri Feb 01, 2013 6:44 pm 

Joined: Sun Dec 12, 2010 2:26 pm
Posts: 885
Conclusion:

If you can isolate a crash using the above technique, you can isolate hotspots the same way. The difficulty is that since the code doesn't crash, I can't give you an offset. To give you an offset, I have to have the debug files.

Well, you could also attempt to make the code crash........ :lol:

I arrived at the CS:EIP values by using time-based profiling instead of event counters (event-based profiling). The CS:EIP values are the addresses where the profiler found dsp_stereo_tool.dll spending the most timer samples...
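
For anyone unfamiliar with time-based profiling, the idea can be sketched in a few lines of Python: the profiler records the instruction pointer at regular timer ticks, and the addresses that show up most often are the hotspots. The sample stream below is made up; only the address values are borrowed from my list above, and `hottest_addresses` is a hypothetical helper, not part of any profiler.

```python
from collections import Counter


def hottest_addresses(samples, top=5):
    """Count how often each sampled instruction pointer occurs."""
    return Counter(samples).most_common(top)


# Made-up sample stream: one entry per timer tick.
samples = [0x205E468] * 4 + [0x205E475] * 3 + [0x205E7FB] * 2 + [0x205E801]

for address, hits in hottest_addresses(samples):
    print(f"{address:#x}: {hits} samples")
```

No crash is needed for this; the sample counts alone point at the code that is hot.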


PostPosted: Fri Feb 01, 2013 7:04 pm 

Joined: Sun Jul 05, 2009 11:17 am
Posts: 36
Quote:
No VST again :x

... sometimes I get the feeling that I am the only one paranoid to use the VST plugin.
I also use the VST. Guess VST users are just low on the totem pole. :lol:


PostPosted: Fri Feb 01, 2013 7:13 pm 

Joined: Sun Dec 12, 2010 2:26 pm
Posts: 885
Investigating some more, this is what I see when I drill down into in_mp3.dll, which belongs to AOL (Winamp). They were apparently comfortable enough to provide assembly information... :shrug:

Where I placed the :arrow: is where the largest number of timer samples happened in that DLL.

Address    Code Bytes            Instruction                  Symbol                    Timer samples
0x7715e80  0x 51                 push ecx                     -                         0
0x7715e81  0x 53                 push ebx                     -                         0
0x7715e82  0x 55                 push ebp                     DeleteAudioDecoder+11215  0.43
0x7715e83  0x 56                 push esi                     DeleteAudioDecoder+11216  0.43
0x7715e84  0x 8B 70 24           mov esi,[eax+24h]            DeleteAudioDecoder+11217  1.29
0x7715e87  0x 57                 push edi                     DeleteAudioDecoder+11220  0.43
0x7715e88  0x 8B 78 1C           mov edi,[eax+1ch]            -                         0
0x7715e8b  0x 8B D7              mov edx,edi                  DeleteAudioDecoder+11224  0.43
0x7715e8d  0x C1 FA 03           sar edx,03h                  -                         0
0x7715e90  0x 83 E2 FE           and edx,feh                  -                         0
0x7715e93  0x 66 0F B6 0C 16     movzx cx,byte [esi+edx]      DeleteAudioDecoder+11232  0.43
0x7715e98  0x 66 0F B6 6C 16 01  movzx bp,byte [esi+edx+01h]  DeleteAudioDecoder+11237  3  :arrow:
0x7715e9e  0x 66 C1 E1 08        shl cx,08h                   -                         0
0x7715ea2  0x 66 0B E9           or bp,cx                     DeleteAudioDecoder+11247  0.43
0x7715ea5  0x 66 8B DF           mov bx,di                    -                         0
0x7715ea8  0x 66 83 E3 0F        and bx,0fh                   -                         0
0x7715eac  0x 66 8B CB           mov cx,bx                    -                         0
0x7715eaf  0x 66 D3 E5           shl bp,cl                    -                         0
0x7715eb2  0x 0F B7 CD           movzx ecx,bp                 DeleteAudioDecoder+11263  1.29
0x7715eb5  0x 89 4C 24 10        mov [esp+10h],ecx            DeleteAudioDecoder+11266  0.86
0x7715eb9  0x 8B CF              mov ecx,edi                  -                         0
0x7715ebb  0x 83 E1 0F           and ecx,0fh                  -                         0
0x7715ebe  0x BD 10 00 00 00     mov ebp,00000010h            DeleteAudioDecoder+11275  0.86
0x7715ec3  0x 2B E9              sub ebp,ecx                  -                         0
0x7715ec5  0x 39 6C 24 18        cmp [esp+18h],ebp            -                         0
0x7715ec9  0x 76 27              jbe $+29h (0x7715ef2)        DeleteAudioDecoder+11286  1.29
0x7715ecb  0x 8B 48 08           mov ecx,[eax+08h]            -                         0
0x7715ece  0x 49                 dec ecx                      DeleteAudioDecoder+11291  0.86
0x7715ecf  0x 83 C2 02           add edx,02h                  -                         0
0x7715ed2  0x 23 CA              and ecx,edx                  -                         0
0x7715ed4  0x 66 0F B6 54 0E 01  movzx dx,byte [esi+ecx+01h]  -                         0
0x7715eda  0x 66 0F B6 0C 0E     movzx cx,byte [esi+ecx]      -                         0
0x7715edf  0x 66 C1 E1 08        shl cx,08h                   -                         0
0x7715ee3  0x 66 0B D1           or dx,cx                     -                         0
0x7715ee6  0x B1 10              mov cl,10h                   -                         0
0x7715ee8  0x 2A CB              sub cl,bl                    -                         0
0x7715eea  0x 66 D3 EA           shr dx,cl                    -                         0
0x7715eed  0x 66 09 54 24 10     or [esp+10h],dx              DeleteAudioDecoder+11322  0.43
0x7715ef2  0x 8B 54 24 18        mov edx,[esp+18h]            -                         0
0x7715ef6  0x 8B 48 0C           mov ecx,[eax+0ch]            DeleteAudioDecoder+11331  0.43
0x7715ef9  0x 01 50 18           add [eax+18h],edx            DeleteAudioDecoder+11334  0.43
0x7715efc  0x 29 50 10           sub [eax+10h],edx            -                         0
0x7715eff  0x 49                 dec ecx                      -                         0
0x7715f00  0x 03 FA              add edi,edx                  DeleteAudioDecoder+11341  0.43
0x7715f02  0x 23 F9              and edi,ecx                  DeleteAudioDecoder+11343  0.43
0x7715f04  0x 89 78 1C           mov [eax+1ch],edi            -                         0
0x7715f07  0x 0F B7 44 24 10     movzx eax,word [esp+10h]     DeleteAudioDecoder+11348  0.43
0x7715f0c  0x 5F                 pop edi                      DeleteAudioDecoder+11353  0.43
0x7715f0d  0x 5E                 pop esi                      -                         0
0x7715f0e  0x B9 10 00 00 00     mov ecx,00000010h            -                         0
0x7715f13  0x 2A CA              sub cl,dl                    -                         0
0x7715f15  0x 5D                 pop ebp                      -                         0
0x7715f16  0x D3 E8              shr eax,cl                   DeleteAudioDecoder+11363  1.72
0x7715f18  0x 5B                 pop ebx                      -                         0
0x7715f19  0x 59                 pop ecx                      DeleteAudioDecoder+11366  0.86
0x7715f1a  0x C2 04 00           retnd 0004h                  DeleteAudioDecoder+11367  0.43
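
Reading the listing above as plain x86, it looks like a "get n bits" routine over a ring buffer: two bytes are loaded and combined big-endian, shifted left by the bit position modulo 16, topped up from the next 16-bit word when the request straddles a word boundary, and finally shifted right so that only the requested bits remain. A minimal Python sketch of that reading follows. The class and field names are invented, the structure offsets in the listing are not mapped to anything documented, and this is one interpretation of the disassembly rather than Winamp's actual source.

```python
class BitReader:
    """Ring-buffer bit reader; all names here are hypothetical."""

    def __init__(self, data: bytes):
        # Length must be a power of two so an index can wrap with a mask.
        assert len(data) >= 2 and len(data) & (len(data) - 1) == 0
        self.buf = data
        self.bit_pos = 0

    def _word(self, byte_index: int) -> int:
        # Two single-byte loads combined big-endian (movzx / shl 8 / or).
        i = byte_index & (len(self.buf) - 1)
        return (self.buf[i] << 8) | self.buf[i + 1]

    def get_bits(self, n: int) -> int:
        assert 0 <= n <= 16
        byte_index = (self.bit_pos >> 3) & ~1  # sar 3, then clear the low bit
        shift = self.bit_pos & 15              # bit offset inside the word
        word = (self._word(byte_index) << shift) & 0xFFFF
        if n > 16 - shift:                     # request crosses into next word
            word |= self._word(byte_index + 2) >> (16 - shift)
        # Advance and wrap the bit position around the ring buffer.
        self.bit_pos = (self.bit_pos + n) & (len(self.buf) * 8 - 1)
        return word >> (16 - n)                # keep only the top n bits


reader = BitReader(bytes([0b10110011, 0x55, 0xFF, 0x00]))
print(reader.get_bits(3), reader.get_bits(5))
```

If that reading is right, the marked instruction is just the second byte load of the bitstream word, which would fit a routine that is called very often with small bit counts.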


PostPosted: Fri Feb 01, 2013 10:43 pm 

Joined: Tue Aug 02, 2011 5:24 pm
Posts: 101
Quote:
BETA025: Optimized compressor code.
Stand alone: http://www.stereotool.com/download/ster ... 04-025.exe
Winamp DSP: http://www.stereotool.com/download/dsp_ ... 04-025.exe


Edit: O wow.
Performance numbers for my measurement, FLAC -> ST -> MP3, amount of audio processed in 15 seconds:
- Old version with compressor enabled: 3:00 --> CPU load 8.333%
- New version with compressor enabled: 3:30 --> CPU load 7.143%
- Compressor disabled: 5:30 --> CPU load 4.545%
Subtracting the baseline with the compressor disabled (4.545%), the compressor itself accounts for CPU loads of 3.788% (old) and 2.597% (new). So indeed a drop of over 30%!

New compressor peak mode: 4:15 --> CPU load 5.882% (1.337%), which means that the Smart (RMS or peak) detection and the compression itself each take about 1.3% on my system.

I know this is still a lot more than what the old compressor uses (I seem to see a CPU usage of 0.2%, but that could very well just be measurement noise).
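
The numbers quoted above can be re-derived with a small Python sketch, assuming CPU load is simply the 15 seconds of wall-clock time divided by the amount of audio processed in that time. The helper names `cpu_load` and `mmss` are made up for this example.

```python
def mmss(text: str) -> int:
    """Convert an 'M:SS' duration to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def cpu_load(audio_seconds: float, wall_seconds: float = 15.0) -> float:
    """CPU load in percent: wall-clock time spent per second of audio."""
    return 100.0 * wall_seconds / audio_seconds


baseline = cpu_load(mmss("5:30"))         # compressor disabled
old = cpu_load(mmss("3:00")) - baseline   # old version, compressor share only
new = cpu_load(mmss("3:30")) - baseline   # new version, compressor share only
peak = cpu_load(mmss("4:15")) - baseline  # new compressor in peak mode
drop = 100.0 * (old - new) / old

print(f"old {old:.3f}%  new {new:.3f}%  peak {peak:.3f}%  drop {drop:.1f}%")
# old 3.788%  new 2.597%  peak 1.337%  drop 31.4%
```

So the quoted "over 30%" works out to roughly a 31.4% reduction in the compressor's share of the load.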

No VST :(


PostPosted: Fri Feb 01, 2013 10:58 pm 
Site Admin

Joined: Mon Mar 17, 2008 1:40 am
Posts: 11425
Will build the VST version later tonight... I just ran a quick build this afternoon to get some feedback on the performance improvement, didn't have time to build all 3 versions.


@Brian: I'm looking at the .map file, but the numbers you are reporting don't appear in it at all - nothing even remotely close. I'll try to connect a debugger and see if I can figure out the locations. (And yes, I did try the winampDSPGetHeader2 address and offset; no luck, because the .map file contains different segments and the values seem to map to a data segment - which doesn't make sense.)

Edit: Wow, was looking at the wrong column! Now retrying ;)


PostPosted: Fri Feb 01, 2013 11:09 pm 

Joined: Sun Nov 06, 2011 1:38 am
Posts: 38
Location: Geneva
Quote:
Quote:
No VST again :x

... sometimes I get the feeling that I am the only one paranoid to use the VST plugin.
I also use the VST. Guess VST users are just low on the totem pole. :lol:
I agree with you, it's always the same with us "alien" VST users....

_________________
http://www.kanal80.net
Best 80's & 90's Music

