Introduction
There are hundreds of different algorithms for
factoring numbers. Each have their advantages and disadvantages,
but they typically have similar approaches. In this page,
I will introduce each methodology first, then discuss
common factoring algorithms based on the methodology.
Trial Division
Trial division will always have its place
in factoring, regardless of advances in algorithms. If a number
is divisible by a small prime (say less than 100000), this is
typically the fastest way to find it. Most packages for factoring
numbers will start with a trial division phase up to some search limit.
x^2 - y^2
Everyone should remember the simple rule from
Algebra that (x^2 - y^2) = (x-y)(x+y). So, if you can find a representation
of the number you wish to factor, as a difference of squares, you can
immediately find two factors (x-y) and (x+y). This is the basis for
Fermat's Factoring Method. It works extremely well when there are
two factors for a given number that are really close together (i.e.
close to the square-root of N.)
Fermat's Factoring Method
Fermat's method seeks to find a representation
of N in the form (x^2 - y^2), where you immediately have the
decomposition N = (x-y)(x+y). You can implement this in several ways,
such as iteratively adding a square to N until the result is a square:
n+x^2 = y^2. Or, you could start x = 1+|sqrt(N)|, and iteratively
search for some i such that (x+i)^2 - N = y^2. Although this method
is extremely fast to factor numbers where the factors are 'close'
to Sqrt(N), its running time is very poor when they are not. Still,
if you are implementing a generic factoring routine, you should
run a few iterations of this method just in case the factors happen
to be close to Sqrt(N).
Legendre's Congruence
Legendre's Congruence is when you have
x and y such that x^2 = y^2 (mod n). It is basically the same
as the x^2-y^2 setup above, but instead of x^2-y^2 = N, we
have x^2-y^2 = k*N. If you have such a congruence, then there
is a good chance that the factorization of the polynomial
on the left (x-y)(x+y) will "split" the factors of N. That
is (x-y) = r*p and (x+y) = s*q, where N=p*q. In that case,
all we need to do to find a factor is compute the GCD
of (x-y) or (x+y) with N. E.g. GCD(x+y, N). However, if no
factorization is found, you have to find another representation
and try again. It is important to say that the most advanced
and effective algorithms today for factoring large numbers, are
based on this idea. E.g. Shank's SQUFOF, the Continued Fraction
method, the Quadratic Sieve (QS), and the Number Field Sieve (NFS).
However, each method has its own way of generating Lengendre congruences.
Quadratic Sieve (QS)
The Quadratic Sieve is a common-sense
approach for finding solutions x^2 = y^2 (mod n). For
most x's you test, the result (x^2 mod n) will not be a
square. But, when it isn't, you can think of this result
as a 'piece of a square'. All you need to do is take
these 'pieces' and combine them together to form a whole square.
An example will make this method very obvious. Let's say we
test various squares mod n, and find two, x and y, that have
the following factorizations mod n:
x^2 = a*b^2 (mod n)
y^2 = a*c^2 (mod n)
Now, each result by itself doesn't give us a Legendre Congruence.
But, if we combine these results together we get:
(x*y)^2 = (a*b*c)^2 (mod n)
And viola, we have a Legendre Congruence. Now, we could
try factoring N by taking GCD(xy+abc, N).
The above is a very simplistic model. In
practice, you may need to combine hundreds of these partial
results to get a solution. There are simple tricks using
Matrices and simple Linear Algebra to find combinations
that combine to make a perfect square. Normally, you don't
collect every x^2 mod n result, and instead only keep those
that are 'smooth', divisible by only small primes. You typically
choose a 'factor base' which is basically a set of small primes,
that you use to decompose the trial x's. If the result doesn't
decompose with just primes from the factor base, you skip it
and choose another. There are also many variations of the
Quadratic Sieve, such as using different polynomials than 'x^2',
the 'Large Prime' variation (extending the factor base to include
various large primes found during trial factorizations), and
'Early Abort' strategy (skipping number heuristically based on
whether or not it is divisible by some threshold of small primes).
The Multiple Polynomial Quadratic Sieve (MPQS) is the fastest
algorithm available today for factoring general numbers up to
130 digits. It's only rival is the General Number Field Sieve (GNFS),
which is currently faster for numbers exceeding 130 digits.
Number Field Sieve (NFS)
The Number Field Sieve is an efficient
way of factoring large numbers of the form N = r^e + s,
where r and s are relatively small an e possibly large. It
is mostly just like the Quadratic Sieve, except that instead
of using rational integers, it uses integers from a "cleverly
chosen" algebraic field (which depends on the form of N). The
best example I've seen is in a book by Hans Reisel.
Pollard (p-1) method
The Pollard (p-1) method is an
obvious approach for anyone who knows Fermat's Little
Theorem (a^(p-1) = 1 (mod p)), and is interested in
factoring numbers. Consider the following congruence:
2^x = c (mod n)
Now, let's imagine n being composite with some prime factor 'p'.
Given the above congruence, we also know...
2^x = c (mod p)
However, if x is divisible by (p-1) then using Fermat's
Little Theorem we can simplify the latter to:
2^x = 1 (mod p)
Or
2^x - 1 = k*p
So, by taking GCD(2^x - 1, N), we arrive at the factor 'p'.
With this construct, all that is needed is to compute
some x that is divisible by (p-1). This is typically done
by hoping (p-1) consists of only small prime factors,
and taking x to be a product of small primes. In practice,
you don't actually compute the entire product for 'x'.
Instead, you just iteratively build up the result. For
example, what we might want is something like this:
2^(2*3*5*7*11*13*...) = c (mod n)
But, instead of computing c directly, taking x as the
entire product of primes, you can iteratively compute it
as follows:
2^3 = z[0] (mod n)
z[0]^5 = z[1] (mod n)
z[1]^7 = z[2] (mod n)
...
z[i]^q = z[i+1] (mod n)
With this approach you are basically throwing in more
and more primes into the product x, hoping that it
will be become divisible by (p-1). See my NTL routines
page for an example implementation.
Elliptic Curve Method (ECM)
Considered a generalization of
the (p-1) method. Instead of using a multiplicative
group (i.e. the decomposition of (p-1) into a product
of small primes), you use a group of points on an
elliptic curve. With each curve you choose, you are
basically doing the same algorithm as (p-1), but since
you can choose different curves and thus different
groups, you have a better chance of finding a 'smooth' group.
For a basic description of ECM along with some example
code, see Mark Herkommer's book.
Rho Method
Th rho method factors numbers by
finding the period of a chosen function mod n. Typically
this function is x^2+a, where a does not equal 0 or 2,
and the technique involved in finding the period is
'Floyd's cycle-finding algorithm.' Basically, you just
iteratively calculate the x_i and x_2i elements of this
function and wait for them to be equal. See my NTL routines
page for an example implementation.
Joe K. Crump |
Number Theory Home