Windows stand alone:
http://www.stereotool.com/download/ster ... 41-023.exe
Winamp DSP:
http://www.stereotool.com/download/dsp_ ... 41-023.exe
VST32:
http://www.stereotool.com/download/vst_ ... 41-023.dll
I'm seeing a rather large CPU load difference when using the composite LIMITER (not clipper!) which I didn't expect and don't understand. So I'm curious what others are seeing.
Changes:
- Hard Limit for composite clipper caused very soft clicks every block!!! Also in older versions...
- Composite Limiter was running in a separate thread, which added 2 ms of extra latency. The new version does not do that anymore and returns a cleaner spectrum, but it requires a bit more horsepower from the PC because it no longer runs on a separate CPU core.
Older changes:
- Added HQ mode. Not available, for testing only.
- Improve Multiband3 and Singleband2 limiting and (to a lesser extent) compression for low latency settings. LQ output should sound similar to normal output! Fixing this will also improve audio at lower latency settings. Compressor is probably more or less ok, limiter is pretty horrible, also at lower latencies!
- Fixed Phase Rotation frequency effects at low latencies (need to compensate for loss at certain freqs in low latency modes)
- Fixed AZIMUTH behavior at lower latencies
- AGC behaves slightly differently for lower latencies - Kinda OK. With shorter block size the drop for short spikes is bigger, which leads to a slightly lower overall output level. But I cannot easily fix that. Other differences are fixed now.
- Clipper (probably only ABDP) does not work well for latency 128. Yup -> if I lower the top bass freq from 400 to 200 Hz it's MUCH better. Fixed.
- Something removes low bass in low latency modes. -> EQ and other things. -> Improved. Difference is still large though.
Rewrote LQ Low Latency monitoring to use the normal processing code. Works reasonably well; the sound resembles that of the normal latency EXCEPT for the bass limiters and, to a lesser extent, the compressors in the multiband section. Memory usage for the plugin version is reduced by more than 20 MB. The stand alone version might use slightly more than before.
- Fixed FOX TV Carbon Coder R128 normalization issue. Waiting for feedback.
- Moved a lot of threads into a single thread. Might improve hiccups that some people have reported.
- Added Power Highs (it's in the same window as Power Bass).
- Moved Power Bass and Power Highs to before the wideband AGC to improve volume level consistency.
- Sudden fast rise of bass or highs is limited, new slider 'Release boost' added. I'm not really sure yet if this is ok; if there's a loud high or low sound it can push the band down a lot, and it comes down slower than before. If needed I can add something to allow it to come back faster after a short spike. Waiting for feedback first though.
- Sidechain checkbox removed (the mode without sidechain doesn't exist anymore).
Attempt #2: Redesigned Simple Clipper. Reduced CPU load.
- Reduced the memory usage
- Fixed most of the Stereo Image artifacts!!! "Deprecated" is removed from the sliders that were marked with it. See (*) for a cool new possibility!
- Removed some more unnecessary steps (AZIMUTH 2x, Stereo Boost 2x). 10 remaining.
Fixed 'Post filter for DC offset' problem.
52. Check CPU load. Start with checking if there's anything left that uses the 'unnecessary steps'. Sevdah Web preset: Data still gets converted 58 times... I think I need to do this one first, it should have some effect on the CPU load.
28 removed - next convert the 2 IIR filters so they can be optimized and the merge/split around it can be removed. I'm not measuring any effect from this though (but it makes the code simpler which is also good)
53. Noise Gate/Stereo Boost: Pre-calculate 1-cos() and sqrt() values.
55. Check MemoryPool behavior for cache improvements -> No effect measured, and might make behavior less constant.
56. Check if we can go in opposite direction for each next step to improve cache.
57. Check if lazy reverse FFT is an option. -> No, difficult and gain does not even seem to be measurable.
58. Created a separate class that performs the processing chain. Currently the same code is repeated twice (once for normal processing, once for low latency processing) - which means that a lot of code is duplicated and it's difficult to add extra chains. Most, not all, of that code is now moved elsewhere.
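The idea in item 58 (one class that runs a processing chain, so the normal and low latency paths no longer each carry their own copy of the code) could look roughly like the sketch below. This is not the actual Stereo Tool code; `ProcessingChain`, `gain` and `hard_limit` are made-up names, and the two example chains are only there to show that both paths share the same step implementations.

```python
from typing import Callable, List, Sequence

# A processing step takes a block of samples and returns the processed block.
Step = Callable[[List[float]], List[float]]


class ProcessingChain:
    """Runs a fixed list of steps on each audio block (hypothetical design)."""

    def __init__(self, steps: Sequence[Step]) -> None:
        self.steps = list(steps)

    def process(self, block: List[float]) -> List[float]:
        # Feed the output of each step into the next one.
        for step in self.steps:
            block = step(block)
        return block


def gain(factor: float) -> Step:
    # Hypothetical step: multiply every sample by a constant factor.
    return lambda block: [s * factor for s in block]


def hard_limit(ceiling: float) -> Step:
    # Hypothetical step: clamp every sample to [-ceiling, +ceiling].
    return lambda block: [max(-ceiling, min(ceiling, s)) for s in block]


# Both chains reuse the same step code; only the list of steps differs.
normal_chain = ProcessingChain([gain(2.0), hard_limit(1.0)])
low_latency_chain = ProcessingChain([hard_limit(1.0)])

normal_out = normal_chain.process([0.25, -0.75, 0.5])
low_latency_out = low_latency_chain.process([0.25, -1.5, 0.5])
```

With this layout, adding an extra chain means building another list of steps instead of copying the processing code a third time.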
TO BE DONE:
- Spread over cores is not constant, which causes differences in performance. I *think* it might be the chain2() code that causes this. Actually it might be a good idea to get rid of that completely...
- Get rid of chain2() thread. This should also allow reducing the ASIO latency by 1 step (usually 1.5 ms). Hm... Or not? I'm confused.

- Composite Limiter effect no longer visible in GUI. (Is that bad?)
- New ASIO behavior: Push samples, read them back directly from buffer, skip whole Chain2 stuff. For HQ mode, add redundancy protection.
- Reduce stand alone version memory usage (unused low latency thread items can be removed.)
- The old Hard Limit for composite limiting was slightly tighter when the input level was very high, and it had no overshoots; the new one does.
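For anyone wondering how the latency figures above (the 2 ms of the old Composite Limiter thread, the "1 step" of ASIO latency) relate to buffer sizes, here is the arithmetic as a small sketch. The 64 sample buffer and the 44.1 kHz sample rate are assumptions picked for illustration, not values taken from the program; your ASIO driver may use different ones.

```python
def buffer_latency_ms(buffer_samples: int, sample_rate_hz: int) -> float:
    """Time covered by one buffer of audio, in milliseconds."""
    return 1000.0 * buffer_samples / sample_rate_hz


def latency_in_samples(latency_ms: float, sample_rate_hz: int) -> float:
    """Number of samples that correspond to a given latency."""
    return latency_ms * sample_rate_hz / 1000.0


# Assumed values: a 64 sample ASIO buffer at 44.1 kHz.
one_step_ms = buffer_latency_ms(64, 44100)

# The 2 ms extra latency of the old limiter thread, expressed in samples.
two_ms_samples = latency_in_samples(2.0, 44100)

print(f"one buffer step: {one_step_ms:.2f} ms")
print(f"2 ms at 44.1 kHz: {two_ms_samples:.1f} samples")
```

Under these assumptions one buffer step comes out close to the "usually 1.5 ms" mentioned above.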