Tuesday, October 09, 2018

[tebiwyxz] Trial division before Miller-Rabin

How much trial division should be tried before doing a Miller-Rabin primality test?  It depends on the size of your number, the speed of your trial division, and the speed of your Miller-Rabin (modular exponentiation) implementation.

Doing one additional trial division by the next untried prime number typically requires constant time if you are reading the list of primes from a precalculated list, so you can estimate the time it takes to do one more by doing a bunch and computing the average.  (It's more complicated if you are generating the list of primes on the fly with a sieve or even a recursive call to Miller-Rabin.  We won't tackle this case.  Seriously, just use a precalculated list.)
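That measurement can be sketched like this (the list and names are my own illustration; in practice the prime list would be far longer and read from a file):

```python
# Hypothetical sketch: average the cost of one trial division by timing
# a batch of them against a precalculated list of primes.
import time

# Tiny stand-in for a precalculated prime list; generate or download a
# real one once ahead of time.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

def time_per_trial_division(n, primes, reps=10_000):
    """Estimate the average wall-clock time of one trial division of n."""
    start = time.perf_counter()
    for _ in range(reps):
        for p in primes:
            n % p          # one trial division
    elapsed = time.perf_counter() - start
    return elapsed / (reps * len(primes))
```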

Doing one additional trial division by prime p eliminates 1/p of the remaining composites.  Divisibility by distinct primes is independent (isn't that cool?), so this holds regardless of which composites prior trial divisions have already eliminated.
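A quick empirical check of that independence, over a range of my own choosing: among numbers that already survived trial division by 2 and 3, the fraction that trial division by 5 would eliminate is still almost exactly 1/5.

```python
# Survivors of trial division by 2 and 3, up to a million.
survivors = [n for n in range(2, 1_000_000) if n % 2 != 0 and n % 3 != 0]

# Fraction of those survivors that trial division by 5 eliminates.
frac = sum(1 for n in survivors if n % 5 == 0) / len(survivors)
# frac comes out very close to 1/5
```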

Measure the amount of time T for one Miller-Rabin failure ("definitely composite").  This is the time for one modular exponentiation.  (We ignore the situation of a composite passing the first Miller-Rabin trial and failing a later one, because it is rare.  We also ignore the possibility of Miller-Rabin exiting early because it found a nontrivial square root of unity.)  (Also note that the BPSW primality test starts with a Miller-Rabin trial with base 2, so all of this applies to BPSW too.)  One additional trial division by p eliminates a fraction 1/p of the remaining composites, so it avoids a fraction 1/p of those Miller-Rabin costs; the expected time saved in the Miller-Rabin phase is (1/p)*T.  But the trial-division phase gets longer by the cost of that one extra division.  Keep adding trial divisions until the marginal net gain drops to zero.
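The stopping rule can be sketched as follows (t_div and T_mr are my own names for the two measured costs; this is an illustration, not a tuned implementation):

```python
def primes_worth_trying(t_div, T_mr, primes):
    """Trial-divide by p only while the expected Miller-Rabin time saved,
    T_mr / p, still exceeds the cost t_div of one more trial division."""
    worth_it = []
    for p in primes:
        if T_mr / p > t_div:
            worth_it.append(p)
        else:
            break          # marginal net gain is zero or negative; stop
    return worth_it
```

Equivalently: trial-divide by every prime p < T_mr / t_div.  For example, if one modular exponentiation costs 100 times as much as one trial division, the cutoff falls just below 100.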

In order to calculate the cutoff, you will need the size of the nth prime p as a function of n.  It can be estimated using the Prime Number Theorem as p(n) ≈ n*log(n), or looked up exactly in a precomputed table online.  (Aren't both of these also cool?)
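A one-liner for the PNT estimate, with a sanity check against a table value:

```python
import math

def nth_prime_estimate(n):
    """Prime Number Theorem estimate of the nth prime: n * log(n)."""
    return n * math.log(n)

# The 1000th prime is actually 7919; n*log(n) gives about 6908.  The
# plain PNT estimate runs low; p(n) ~ n*(log n + log log n) is a
# standard refinement.
```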

There will eventually be an n large enough: one more trial division costs a constant amount of additional time (as measured above), but saves only O(1/(n*log n)) expected time in Miller-Rabin, which shrinks toward zero.

ispseudoprime() in Pari/GP essentially does trial division by the primes up to (only!) 103, though very quickly, using a clever application of GCD.  The fact that it does so little trial division is inadvertently useful: we can easily bolt our own trial-division filtering step on beforehand.
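The GCD idea can be sketched like this (my own illustration of the technique, not Pari's actual code): multiply the primes up to 103 into a single primorial, and one gcd with the candidate reveals whether any of them divides it.

```python
from math import gcd, prod

SMALL_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47,
                53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103]
PRIMORIAL = prod(SMALL_PRIMES)   # product of the primes up to 103

def has_small_factor(n):
    """For candidates n > 103: True iff some prime <= 103 divides n."""
    return gcd(n, PRIMORIAL) > 1
```

A candidate that returns True is rejected with no modular exponentiation at all; one that returns False goes on to Miller-Rabin.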

GNU MP's mpz_probab_prime_p does a variable amount of trial division, but the code has the comment: /* FIXME: tune this limit */
