Review Request: BigInteger patch for efficient multiplication and division (#4837946) (original) (raw)

Alan Eliasen eliasen at mindspring.com
Sat May 4 12:13:52 UTC 2013


On 05/02/2013 06:37 PM, Brian Burkhalter wrote:

On May 2, 2013, at 5:02 AM, Alan Bateman wrote:

On 02/05/2013 02:34, Tim Buktu wrote: :

Alan is working on an improved BigInteger.toString() that should be dramatically faster for large numbers. What's the deadline for getting this in? Thanks, Tim Here's the latest milestone dates: http://openjdk.java.net/projects/jdk8/milestones Given the size of the patch then it would be great to get it in as early as possible. With the review effort then I assume Feature Complete is too tight although I know Brian Burkhalter is anxious to get it in as soon as possible. I can't comment on the BigInteger.toString work to know how big this is. I think that's the best answer that can be given: as soon as possible. The 4837946 patch is far from simple to comprehend to say the least and then there is the testing so it's not going to be fast. If the toString() work is ready early enough to review then perhaps we can handle it, but I don't want to make anyone hurry to complete something and then say we cannot do it.

Brian:

If you're feeling overwhelmed by the magnitude of the changes to BigInteger, I would very strongly suggest that you review it in stages. This e-mail proposes stages for the review of this patch.

Since I first posted these patches over 5 years ago, we were asked to make patches as small as possible so they would get reviewed more quickly. That's why my patches focused on what was a minimal but necessary subset: making multiply and pow work much faster with very simple, understandable, well-known, hard-to-get-wrong routines and simple optimizations that built on existing routines but gave very significant performance increases. (Implementing Karatsuba and Toom-Cook multiplication routines in Java is often given as a homework problem in undergrad Algorithms courses. Understanding their correctness requires nothing more than first-year Algebra.)

Today, I finished porting my recursive Schoenhage toString routines that I've been using for years in my programming language "Frink" ( http://futureboy.us/frinkdocs/ ). They are drastically, stunningly faster than the existing BigInteger.toString algorithms. I have run these through my enormous regression tests which produce over 232 GB of output with no differences.

I ram these (very large) regression tests which test multiplication and pow and toString functions in BigInteger. (They produce about 232 gigabytes of output, which I compare against known good output that was produced by previous versions of Java and by the Kaffe VM which used the GMP library. This tells you how long I've been working on this; Kaffe removed support for GMP over 4 years ago.)

It should be known that I would need to further increase the size of these tests to make a significant test of the very large numbers that trigger Schönhage-Strassen (SS) multiplication. Currently, the smallest numbers that will trigger SS multiplication are both operands >247,000 bits, or about 7718 ints. SS multiplication is used at all times when both operands are 1,249,000 bits or more. These are biggish numbers, and my tests currently don't go out that far. I can vouch strongly for the correctness of Karatsuba and Toom-Cook multiplication, though.

Keep in mind that torture-testing Schönhage-Strassen multiplication and comparing against the runs of Java 1.7 or earlier might take a very long time because the earlier versions of BigInteger.multiply(BigInteger) are so slow in multiplications of the size where SS multiplication is used.

I can personally vouch for having very strongly tested the parts of multiplication and pow that I wrote, which include:

While I have been using Tim's Schönhage-Strassen multiplication and Burnickel-Ziegler division routines in my daily work, and have not detected failures, I have not written explicit tests for division at all, and my number of very large integer divisions has been small. I have implicitly used his division routines in my base conversion routines for large numbers, and have found no errors.

My multiplication tests were written to test general multiplication, but also to very strongly test around the regions at which Karatsuba and Toom-Cook have edge cases. Since I don't know the Schönhage-Strassen algorithms, I don't know where they might have particular edge cases, so I haven't written any specific tests for them.

Brian, would it be possible to make BigInteger.square() public? It would make it possible for end-users to write algorithms that performed notably faster. If not, a possible (much less optimal) optimization would be for multiply to detect that its arguments are equal (either using == which would be fast or .equals or .compareTo which may be much slower if the numbers are large) and call squaring routines instead of doing the naive multiply.

If reviewing time is limited, I might suggest performing the reviews and integration in a few steps:

Step 1) would be integrating the simple, high-benefit, low-risk changes that I made:

these require the helper functions:

It should be noted that the pow() function performs some arguments (e.g. when the base is a power of 2) millions or billions of times faster than current Java and would alone drastically improve the performance of many programs.

Step 2) incorporating my toString changes. It's useless to work with these very big numbers if you can't output them in reasonable time. And right now toString() will be the bottleneck for most programs as it's approximately O(n^2.0634) in my tests. With the improved division routines that Tim Buktu wrote, and my recursive base conversion algorithms, I can make these perform in O(n^1.3596) or better. This means, as a practical matter, that printing the largest known Mersenne number in base 10 is reduced from 18 hours to 150 seconds. (I can provide constants to the asymptotes if desired.)

The bug for improving toString is:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4641897

I considered toString() slightly more controversial because to perform repeated conversions efficiently it is very useful to allocate a cache of "helper" values of 10^2^n to speed up calculations. This had memory usage implications, synchronization implications, and questions about conversions to different bases (say, to convert to base 5, you also need a cache of 5^2^n, etc.) I didn't want this to delay the acceptance of the simple and noncontroversial multiplication and pow routines which had no external impact and didn't touch other files like MutableBigInteger, SignedMutableBigInteger, etc.

My complete patch contains very much faster, simple Schoenhage recursive base conversion routines for toString.

By the way, the synchronization for the new toString() routines could be rewritten to use Future and other synchronization mechanisms. In any case. the current mechanism will almost always be faster.

While I have been using Tim's Schönhage-Strassen multiplication and Burnickel-Ziegler division routines in my daily work, and have not detected failures, I have not written explicit tests for division at all, and my number of very large integer divisions has been small. My multiplication tests were written to test general multiplication, but also to very strongly test around the regions at which Karatsuba and Toom-Cook have edge cases. Since I don't know the Schönhage-Strassen algorithms, I don't know where they might have particular edge cases, so I haven't written any specific tests for them.

Brian, would it be possible to make BigInteger.square() public? It would make it possible for end-users to write algorithms that performed notably faster. If not, a possible (much less optimal) optimization would be for multiply() to detect that its arguments are equal (either using == which would be fast or .equals or .compareTo which may be much slower if the numbers are large) and call squaring routines instead of doing the naive multiply.

Step 3) would involve faster division: The Burnickel-Ziegler and Barrett division routines that Tim wrote. These are important for making base conversion faster, and making division reasonably fast. (For many programs, division speed is the limiting factor. This limits the performance of BigDecimal too.) They are complicated.

Step 4) integrating what I believe are the most tricky routines: Schoenhage-Strassen (SS) multiplication. I respect Tim Buktu's programming skills, but I also know that SS multiplication is hard to get right, even for super-geniuses. The GMP team historically released versions of SS multiplication that were wrong for a small subset of arguments. This stuff is hard.

My latest best version of all of these routines is located at:

http://futureboy.us/temp/BigInteger.java

This passes all of my regression tests for multiply in the Karatsuba and Toom-Cook regions, and all base conversion test. It does not test multiplication in the Schoenhage-Strassen regions, nor does it test large division.

-- Alan Eliasen eliasen at mindspring.com http://futureboy.us/



More information about the core-libs-dev mailing list