[FFmpeg-devel] [PATCH] SIMD-optimized exponent_min() for ac3enc

Justin Ruggles justin.ruggles
Mon Jan 17 14:10:43 CET 2011


On 01/16/2011 09:19 PM, Loren Merritt wrote:
> On Sun, 16 Jan 2011, Justin Ruggles wrote:
>
>> Reversing the outer loop seems unrelated to what you've mentioned. I
>> don't see how it helps. Is it actually faster to have an extra add
>> instead of an offset in the load and store?
>
> The point was to make expq point to the base of the current inner loop.
> Any change in addressing of the outer loop is a side-effect, and isn't
> supposed to affect speed.

ok, I think I've got it now.

I was stuck at reading exp first, then comparing the following blocks, then I finally realized it doesn't matter. Now the inner loop starts at exp+offset and ends at exp, so sub+jae works fine.
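For readers following along without the attachment, here is a minimal C sketch of that loop shape. It is not the attached patch: the function name `exponent_min_sketch`, the `BLOCK_STRIDE` constant, and its value of 256 bytes between blocks are assumptions made for illustration. It only shows why the order of comparison doesn't matter (taking a minimum is commutative) and how counting the offset down to zero lets a single subtract-and-test end the inner loop, roughly what sub+jae does in the asm.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed distance in bytes between the same coefficient in adjacent blocks. */
#define BLOCK_STRIDE 256

/*
 * Hypothetical sketch: for each coefficient, store in block 0 the minimum
 * exponent found across block 0 and the following num_reuse_blocks blocks.
 */
static void exponent_min_sketch(uint8_t *exp, int num_reuse_blocks, int nb_coefs)
{
    if (num_reuse_blocks <= 0)
        return;

    for (int i = 0; i < nb_coefs; i++) {
        /* Start at the last block (exp + offset) instead of reading exp first. */
        ptrdiff_t offset  = (ptrdiff_t)num_reuse_blocks * BLOCK_STRIDE;
        uint8_t   min_exp = exp[i + offset];

        /* Step back one block at a time; the loop ends after offset 0 is read.
         * The subtract-then-test mirrors the sub+jae pair in the asm. */
        while ((offset -= BLOCK_STRIDE) >= 0) {
            uint8_t e = exp[i + offset];
            if (e < min_exp)
                min_exp = e;
        }

        exp[i] = min_exp;
    }
}
```

Because the offset reaches zero exactly at the base of the current inner loop, no separate counter or end-pointer compare is needed.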

New patch attached. The best-case benchmarks are pretty much the same, but on average the new version is more consistently faster.

Thanks,
Justin

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ac3_exponent_min.patch
Type: text/x-patch
Size: 12780 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20110117/785c74b7/attachment.bin>


