All times are UTC+02:00

 Post subject: Re: Stereo Tool 6.00
PostPosted: Thu Feb 03, 2011 1:57 am 
Site Admin

Joined: Mon Mar 17, 2008 1:40 am
Posts: 11425
Quote:
Hans, have a look at these.
Aha, clear. It appears that there the majority (probably all) of the processing is done WITHIN the ASIO thread itself, instead of in a different thread. Chances are that the processing is done more 'smoothly'; i.e., in ST I collect a block of, say, 512 samples of audio and then do all the processing at once. Alternatively, one could just process sample by sample, especially if you don't need/want phase linearity. Or: split the processing into chunks, and perform one chunk on each ASIO call.

Load balancing over cores cannot really be the issue though, because it must also work on a single core system.
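
The "one chunk per ASIO call" idea can be sketched roughly as follows. This is a minimal illustration, not Stereo Tool's code: the class name `ChunkedBlockProcessor`, the `Stage` type and the three-buffer rotation are all assumptions made for the example. The work for one block is split into stages, each callback runs exactly one stage, and the finished block is played out one block later, so the load is spread evenly at the cost of extra latency.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch, not Stereo Tool code: the work for one block is split
// into stages, and each audio callback runs exactly one stage.
class ChunkedBlockProcessor {
public:
    using Stage = std::function<void(std::vector<float>&)>;

    ChunkedBlockProcessor(std::size_t blockSize, std::vector<Stage> stages)
        : stages_(std::move(stages)),
          collecting_(blockSize, 0.0f),
          working_(blockSize, 0.0f),
          output_(blockSize, 0.0f) {
        // One stage per callback, so the block must divide evenly.
        assert(!stages_.empty() && blockSize % stages_.size() == 0);
        samplesPerCallback_ = blockSize / stages_.size();
    }

    std::size_t samplesPerCallback() const { return samplesPerCallback_; }

    // Called once per audio callback; handles samplesPerCallback() samples in place.
    void onCallback(float* samples) {
        for (std::size_t i = 0; i < samplesPerCallback_; ++i) {
            collecting_[pos_ + i] = samples[i];  // store the new input
            samples[i] = output_[pos_ + i];      // play out finished audio
        }
        stages_[stage_](working_);               // one slice of the heavy work
        ++stage_;
        pos_ += samplesPerCallback_;

        if (pos_ == collecting_.size()) {        // a full block has passed
            output_.swap(working_);              // finished block becomes output
            working_.swap(collecting_);          // fresh input becomes the work buffer
            pos_ = 0;
            stage_ = 0;
        }
    }

private:
    std::vector<Stage> stages_;
    std::vector<float> collecting_, working_, output_;
    std::size_t samplesPerCallback_ = 0;
    std::size_t pos_ = 0;
    std::size_t stage_ = 0;
};
```

With a block of 512 samples and 8 stages, each callback would handle 64 samples and one eighth of the work. The output lags the input by two blocks in this sketch, which is the price for never doing the whole block's processing in a single call.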


 Post subject: Re: Stereo Tool 6.00
PostPosted: Thu Feb 03, 2011 2:05 am 

Joined: Sun Dec 12, 2010 2:26 pm
Posts: 885
Quote:
Brian: About the Intel compiler issue: I just realized that there currently _is_ no switching in Stereo Tool - I'm building completely separate versions for different systems, which are included in the installer executable, which automatically selects the correct one. The same is true for IPP: I *only* include the version that I need.

So in each version there's only a single code path which can be taken.
So, contained in the DSP installer, there are TWO versions, one with a /QaxN compiler option and one with a /QxN compiler option?

See page 23 here:

http://cache-www.intel.com/cd/00/00/34/ ... 347599.pdf
Quote:
This option tells the compiler to generate multiple, processor-specific code paths if there is a performance benefit. It also generates a generic IA-32 code path. The generic code is usually slower than the specialized code.
The generic code path is determined by the architecture specified by the -x (Linux and Mac OS) or /Qx (Windows) option. While there are defaults for the -x or /Qx option that depend on the operating system being used, you can specify an architecture for the generic code that is higher than the default. The specified architecture becomes the effective minimum architecture for the generic code path.
If you specify both the -ax and -x options (Linux and Mac OS) or the /Qax and /Qx options (Windows), the generic code will only execute on processors compatible with the processor type specified by the -x or /Qx option.
This option enables the vectorizer and tells the compiler to find opportunities to generate separate versions of functions that take advantage of features of the specified Intel® processor.
If the compiler finds such an opportunity, it first checks whether generating a processor-specific version of a function is likely to result in a performance gain. If this is the case, the compiler generates both a processor-specific version of a function and a generic version of the function. At run time, one of the versions is chosen to execute, depending on the Intel processor in use. In this way, the program can benefit from performance gains on more advanced Intel processors, while still working properly on older processors.
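
The run-time selection described in that passage can be pictured with a small hand-written sketch. This is only an illustration of the general dispatch pattern, not what the Intel compiler actually emits: `CpuFeatures`, `sumGeneric`, `sumFourLanes` and `selectSum` are hypothetical names, the feature flag is passed in rather than read from CPUID, and the "specialized" path is plain C++ shaped like a vectorized loop so that it compiles anywhere.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical CPU description; a real program would fill this in from CPUID.
struct CpuFeatures {
    bool sse2;
};

// Generic path: a plain loop that works on any processor.
inline float sumGeneric(const float* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Stand-in for a processor-specific path: four partial sums, the shape a
// vectorized SSE2 loop would have. Written in plain C++ for portability.
inline float sumFourLanes(const float* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
        for (std::size_t k = 0; k < 4; ++k) lane[k] += data[i + k];
    float total = lane[0] + lane[1] + lane[2] + lane[3];
    for (; i < n; ++i) total += data[i];  // leftover samples
    return total;
}

using SumFn = float (*)(const float*, std::size_t);

// The run-time check: choose one version once, then call through the pointer.
inline SumFn selectSum(const CpuFeatures& cpu) {
    return cpu.sse2 ? sumFourLanes : sumGeneric;
}
```

A build with only one code path would skip `selectSum` entirely and call the chosen function directly, which removes the check but also means the binary fails on processors that lack the required instructions.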


 Post subject: Re: Stereo Tool 6.00
PostPosted: Thu Feb 03, 2011 2:09 am 
Site Admin

Joined: Mon Mar 17, 2008 1:40 am
Posts: 11425
No, If you specify both the -ax and -x options (Linux and Mac OS) or the /Qax and /Qx options (Windows), the generic code will only execute on processors compatible with the processor type specified by the -x or /Qx option.
/QaxW /QxW in this case
Try running the SSE2 version on a Pentium 3 and it will fail.
Same for the IPP code: The 'generic' path is not included, only the P4 path.


 Post subject: Re: Stereo Tool 6.00
PostPosted: Thu Feb 03, 2011 2:16 am 

Joined: Sun Dec 12, 2010 2:26 pm
Posts: 885
Quote:
No, If you specify both the -ax and -x options (Linux and Mac OS) or the /Qax and /Qx options (Windows), the generic code will only execute on processors compatible with the processor type specified by the -x or /Qx option.
/QaxW /QxW in this case
Try running the SSE2 version on a Pentium 3 and it will fail.
Same for the IPP code: The 'generic' path is not included, only the P4 path.
You're not understanding that when you specify axW, which you *HAVE TO* in order to be compatible with AMD (xW is for Intel only), inside the compiler it will end up making TWO codepaths, one for Intel and one that's "Generic IA-32". At runtime when the software determines that it's not running on an Intel processor, it will choose the "Generic IA-32" path.

Unless you hand-code the vectorization yourself, the compiler will do things on its own when it builds the executable. You will not be notified of them, and they will not show up as a performance issue on your system or on any other Intel system; they only appear when the code runs on a non-Intel system.

Like I said, I've had experience in the distributed computing community, and nearly all of those projects produce SEPARATE AMD and Intel versions if they are using Intel compilers.

You might do some more research on this before dismissing it out of hand...


 Post subject: Re: Stereo Tool 6.00
PostPosted: Thu Feb 03, 2011 2:29 am 
Site Admin

Joined: Mon Mar 17, 2008 1:40 am
Posts: 11425
I couldn't quickly find it, but this thread is somewhat clear about it:
http://software.intel.com/en-us/forums/ ... hp?t=56928

Basically, /QaxN /QxN compiles for Intel P4, /QaxW /QxW for any SSE2-CPU.


 Post subject: Re: Stereo Tool 6.00
PostPosted: Thu Feb 03, 2011 2:39 am 
Site Admin

Joined: Mon Mar 17, 2008 1:40 am
Posts: 11425
Here's a version with extra logging. Basically, it logs:
- How much time there is between the start of ASIO thread calls
- How much time is spent INSIDE the ASIO thread call.
http://www.stereotool.com/download/radi ... ogging.exe

Log files are written to C:\temp\log (the directory C:\temp must exist!)

Could anyone with cracks (even at big buffer sizes - that's the issue I want to look at with this) run this, reproduce some cracks/pops, and then send the file to me? Thanks!
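
For readers wondering what such logging involves, the two measurements listed above could be collected roughly like this. It is a minimal sketch and not the code in the linked build: `CallbackTimer` and its members are hypothetical names, and a real audio callback would hand the entries to another thread for writing to disk rather than doing file I/O itself.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the logging code from the build linked above.
class CallbackTimer {
public:
    using Clock = std::chrono::steady_clock;

    struct Entry {
        double sinceLastStartMs;  // time between the starts of consecutive calls
        double insideMs;          // time spent inside this call
    };

    explicit CallbackTimer(std::size_t expectedCalls = 4096) {
        entries_.reserve(expectedCalls);  // avoid allocating inside the callback
    }

    // Call at the very start of the audio callback.
    void begin(Clock::time_point now = Clock::now()) {
        sinceLast_ = hasPrev_ ? toMs(now - prevStart_) : 0.0;
        prevStart_ = now;
        hasPrev_ = true;
    }

    // Call just before the audio callback returns.
    void end(Clock::time_point now = Clock::now()) {
        assert(hasPrev_);  // begin() must have been called first
        entries_.push_back({sinceLast_, toMs(now - prevStart_)});
    }

    const std::vector<Entry>& entries() const { return entries_; }

private:
    static double toMs(Clock::duration d) {
        return std::chrono::duration<double, std::milli>(d).count();
    }

    std::vector<Entry> entries_;
    Clock::time_point prevStart_{};
    double sinceLast_ = 0.0;
    bool hasPrev_ = false;
};
```

If the time inside a call approaches the time between calls, the processing is too slow for the buffer size. If the time between starts jumps well above the buffer duration while the time inside stays small, the callback itself is being scheduled late.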


 Post subject: Re: Stereo Tool 6.00
PostPosted: Thu Feb 03, 2011 2:42 am 

Joined: Sun Dec 12, 2010 2:26 pm
Posts: 885
Quote:
I couldn't quickly find it, but this thread is somewhat clear about it:
http://software.intel.com/en-us/forums/ ... hp?t=56928

Basically, /QaxN /QxN compiles for Intel P4, /QaxW /QxW for any SSE2-CPU.
OK, the W thing has me confused...

What's important is to know whether it does anything at all different for non-Intel CPUs based on your compiler options. The claim from the Intel employee (bias?) is that QxW would steer non-Intel CPUs into SSE2, but the question is: is that SSE2 the same as the SSE2 used when run on an Intel CPU? In other words, is there an "Intel SSE2" and a "Generic SSE2", or just "SSE2"?

Also, it's going to matter what version of the compiler you're using.


 Post subject: Re: Stereo Tool 6.00
PostPosted: Thu Feb 03, 2011 2:54 am 
Site Admin

Joined: Mon Mar 17, 2008 1:40 am
Posts: 11425
I've looked at the generated assembly code many times, and I've never seen multiple paths (at least not since I switched to compiling 2 different versions). Basically, forcing the compiler to create 2 separate versions removes its (not very accurate) determination of which functions should be SSE2-optimized and which shouldn't. So even on an Intel CPU the separate version was about 5% faster (and a few hundred kB smaller).

The code also contains some SSE2 intrinsics. For the IPP library, as I mentioned earlier, I'm also only linking the SSE version of IPP 5.x for the ST SSE version and the SSE2 version from 6.x for the ST SSE2 version. (Also including the SSE3 and SSE4 versions only made things slower: the code is the same, but there's extra overhead to determine which path should be followed and whether the code must be multi-threaded. However, multi-threading is only useful for blocks BIGGER than 4096 samples, so the answer was always 'no', and I just removed those versions.)


By the way: I created a GCC build of the command line version some time ago (just before I created the Linux version). On Windows, that was about 15% slower than the Intel version - on an Intel CPU, that is...


 Post subject: Re: Stereo Tool 6.00
PostPosted: Thu Feb 03, 2011 3:01 am 

Joined: Sun Dec 12, 2010 2:26 pm
Posts: 885
Quote:
By the way: I created a GCC build of the command line version some time ago (just before I created the Linux version). On Windows, that was about 15% slower than the Intel version - on an Intel CPU, that is...
Not surprising... Likely missing IPP...

I'm going to dig on the W option and how it pertains to AMDs... Like I said earlier, possible goose chase...

On the same subject (compiler options), are you also using O2 or O3?


 Post subject: Re: Stereo Tool 6.00
PostPosted: Thu Feb 03, 2011 3:03 am 

Joined: Sun May 02, 2010 11:26 pm
Posts: 547
I've sent you a mail with the log file.

