ok, i've got floaty ported to the new code. it runs a TINY bit slower than the old one (0.6% vs 0.5%). i'm guessing that's just extra cache misses or something, now that i'm processing left and right separately instead of interleaving them.
the code is way cleaner now though, much nicer :-)
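
for anyone wondering what "separately vs interleaving" means in practice, here's a rough sketch. it is NOT the actual floaty code: it assumes left and right are stereo audio channels, and the function names and the plain gain multiply are hypothetical stand-ins for whatever the real per-sample processing is.

```python
def apply_gain_interleaved(buf, gain):
    """Old-style layout (assumed): buf is [L0, R0, L1, R1, ...].

    One loop walks the buffer once and handles both channels per step.
    """
    out = list(buf)
    for i in range(0, len(out), 2):
        out[i] *= gain      # left sample of this frame
        out[i + 1] *= gain  # right sample of this frame
    return out


def apply_gain_planar(left, right, gain):
    """New-style layout (assumed): each channel lives in its own buffer.

    The same single-channel routine runs twice, once per channel,
    so the per-channel code only has to be written once.
    """
    def process(channel):
        # hypothetical stand-in for the real per-sample work
        return [sample * gain for sample in channel]

    return process(left), process(right)
```

both versions produce the same samples, just laid out differently. the second one makes two passes over separate buffers instead of one pass over a shared buffer, which is where a small difference in memory access pattern could come from.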