Ok, I've done some reading into phase rotation filters now (couldn't find a decent description anywhere at first, but I did now).
Basically the idea seems to be extremely simple: find the center frequency of the audio signal (where half of the signal is above that frequency and half is below), and invert the phase of the frequencies above it.
For voice, the example mentions a frequency of 600 Hz.
If you use multiple such switches, the chance of catching other sounds with a different center frequency gets bigger, so there's less chance that asymmetrical spikes remain.
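To make the single-switch idea concrete, here is a minimal sketch, assuming NumPy and a brick-wall FFT split (real phase rotators most likely use all-pass filters instead, so this only illustrates the principle). The function name invert_above and the 600 Hz cutoff in the usage note are just illustrations.

```python
import numpy as np

def invert_above(signal, sample_rate, cutoff_hz):
    """Flip the phase (multiply by -1) of every component above cutoff_hz.

    Brick-wall FFT version: an assumption made for illustration only.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    # A 180 degree phase shift is the same as a sign flip.
    spectrum[freqs > cutoff_hz] *= -1.0
    return np.fft.irfft(spectrum, n=len(signal))
```

For voice, something like invert_above(samples, 44100, 600.0) would be the starting point; everything below 600 Hz stays as it is and everything above comes out inverted.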
Edit:
After reading this explanation, and looking at the waveform that Bojcha sent, I now understand how it works.
Basically the idea is:
- Cut the signal into a few frequency bands
- At the split between two bands, let the next band jump back in time by (samplerate / split frequency) samples.
Example:
Band 1: 0 - 1000 Hz
Band 2: 1000 - 4000 Hz
Band 3: 4000 - ... Hz
Band 1 is placed at position 0
Band 2 is placed 44100 Hz / 1000 Hz ≈ 44 samples earlier than band 1
Band 3 is placed 44100 Hz / 4000 Hz ≈ 11 samples earlier than band 2
(I could probably just as well place them later instead of earlier. Hm... Maybe I can even move band 3 in the opposite direction - testing now... --> No, output looks less good).
(1000 Hz and 4000 Hz are just some random numbers; if I look at Bojcha's wav file, something close to this seems to have been used. Who knows, maybe I should even make them configurable.)
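The band example above can be sketched roughly like this, assuming NumPy, a brick-wall FFT band split, and a circular shift (np.roll) in place of a real delay line. The names band_advances and shift_bands are made up for this sketch.

```python
import numpy as np

def band_advances(sample_rate, split_freqs):
    """Cumulative shift (in samples) of each band relative to band 1."""
    advances = [0]
    for f in split_freqs:
        # Each split moves the next band (samplerate / frequency) samples earlier.
        advances.append(advances[-1] + int(round(sample_rate / f)))
    return advances

def shift_bands(signal, sample_rate, split_freqs):
    """Split into bands at split_freqs and move each higher band earlier in time."""
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    edges = [0.0] + list(split_freqs) + [np.inf]
    out = np.zeros(n)
    for i, advance in enumerate(band_advances(sample_rate, split_freqs)):
        in_band = (freqs >= edges[i]) & (freqs < edges[i + 1])
        band = np.fft.irfft(np.where(in_band, spectrum, 0.0), n=n)
        # Negative roll = earlier in time (wraps around at the ends).
        out += np.roll(band, -advance)
    return out
```

With a 44100 Hz sample rate and splits at 1000 and 4000 Hz, band_advances gives [0, 44, 55]: band 2 sits 44 samples before band 1, and band 3 another 11 samples before band 2.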
I've tried doing this with CoolEdit, and the result looks very good (actually even BETTER than Bojcha's waveform).
This is great. I just take a bunch of frequencies, and on those frequencies I perform a jump of exactly 1 sinusoid. Which is identical to performing no jump at all! But frequencies after this frequency do change. Now all I need to do is find a formula to calculate the phase shift per frequency, based on this description. (For each frequency, a phase shift in the range -π .. π suffices.)
And I also think I can make a smooth transition between rotating and not rotating, with little effect on the sound.
Ah, this is getting easier than I expected.
At 1000 Hz (44 samples per sinusoid), the phase shift is 0.
At 2000 Hz (22 samples per sinusoid), I've moved the thing 44, a multiple of 22, so the phase shift is again 0. Same at 3000, 4000 etc.
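Then let's look at 50% into a phase shift block (the area between two 0-shifts): at 1500 Hz (about 29.4 samples per sinusoid), 44 samples is about 1.5 sinusoids, so the phase shift is π.
So I think drawing a straight line between every multiple of 1000 suffices (e.g. start at 0, end at 2*π = 0).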
To make things easier, demand that the 2nd frequency is a multiple of the first.
Then from there on, do the same thing at a slower pace. (4000 = 0, 8000 = 0, ...)
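Written out as a formula, the straight-line idea could look like the sketch below (assuming NumPy). The function name stepped_phase and the parameters f1 and f2 are made up here, with the 1000 Hz and 4000 Hz example values as defaults; the sign of the shift is also an assumption.

```python
import numpy as np

def stepped_phase(freq_hz, f1=1000.0, f2=4000.0):
    """Phase shift per frequency, wrapped into -pi .. pi.

    Below f1: no shift. From f1 to f2: a straight line from 0 to 2*pi
    between every multiple of f1. From f2 on: the same thing at a slower
    pace, with zeros at every multiple of f2.
    """
    freq_hz = np.asarray(freq_hz, dtype=float)
    phase = np.zeros_like(freq_hz)
    mid = (freq_hz >= f1) & (freq_hz < f2)
    high = freq_hz >= f2
    phase[mid] = 2.0 * np.pi * ((freq_hz[mid] / f1) % 1.0)
    phase[high] = 2.0 * np.pi * ((freq_hz[high] / f2) % 1.0)
    # 2*pi is the same as 0, so fold everything into -pi .. pi.
    return (phase + np.pi) % (2.0 * np.pi) - np.pi
```

For example, stepped_phase([1000, 1250, 2000]) gives 0, π/2 and 0, and halfway into a block (1500 Hz or 6000 Hz) the shift is half a sinusoid, which the wrap reports as -π.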
Going to test this now...
(Damn. I just realized - this is almost exactly what I tried yesterday. But then I threw it away and used another method because I thought this would change the waveform in a far too extreme way.)
Edit: Result: Works very well. If I set Loudness to 4 and increase the bass level, I still don't hear a difference between the iZotope-edited version and my own. I'm now going to test some other sounds.
Edit #2: This is actually a very rough way of saying: "If the frequency is 4 times as high, I want the phase shift to occur 4 times slower". Todo: Test what happens if I really make it smooth instead of with huge steps.
Edit #3: Smooth does indeed work better. I'm now really outperforming iZotope (at least on this James Last track).
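One way to read "4 times as high means 4 times slower" as a smooth curve is a phase slope proportional to 1/frequency, which gives a logarithmic phase curve. That reading is an assumption, not a confirmed formula, and the names smooth_phase, rotate_phase and f_start are made up for this sketch (assuming NumPy and plain FFT processing of the whole signal).

```python
import numpy as np

def smooth_phase(freq_hz, f_start=1000.0):
    """Unwrapped phase shift: 0 up to f_start, then 2*pi*ln(f / f_start).

    The slope is 2*pi / f, so at 4 times the frequency the phase
    changes 4 times slower (assumed interpretation).
    """
    freq_hz = np.asarray(freq_hz, dtype=float)
    return 2.0 * np.pi * np.log(np.maximum(freq_hz, f_start) / f_start)

def rotate_phase(signal, sample_rate, f_start=1000.0):
    """Apply the smooth phase curve; magnitudes are left untouched."""
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    spectrum = spectrum * np.exp(1j * smooth_phase(freqs, f_start))
    return np.fft.irfft(spectrum, n=n)
```

Lowering f_start makes the rotation begin at a lower frequency. Only the phase is changed, so the frequency content of the output should be the same as the input.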
Next: I'm going to try to reduce the frequency where I start to filter. (That might help with voices).
Edit: That also worked. I'm now building an update (will probably add 2 sliders to control the behavior - start frequency and initial between-0-phase-size later). Will post the update when it's ready (20 minutes from now I expect).
Edit: I've already found 2 "vibrating voice" occurrences that are COMPLETELY gone when I enable this new filter! This is starting to look very promising!