Introduction

    There are hundreds of different algorithms for factoring numbers. Each have their advantages and disadvantages, but they typically have similar approaches. In this page, I will introduce each methodology first, then discuss common factoring algorithms based on the methodology.

Trial Division

    Trial division will always have its place in factoring, regardless of advances in algorithms. If a number is divisible by a small prime (say less than 100000), this is typically the fastest way to find it. Most packages for factoring numbers will start with a trial division phase up to some search limit.

x^2 - y^2

    Everyone should remember the simple rule from Algebra that (x^2 - y^2) = (x-y)(x+y). So, if you can find a representation of the number you wish to factor, as a difference of squares, you can immediately find two factors (x-y) and (x+y). This is the basis for Fermat's Factoring Method. It works extremely well when there are two factors for a given number that are really close together (i.e. close to the square-root of N.)

Fermat's Factoring Method

    Fermat's method seeks to find a representation of N in the form (x^2 - y^2), where you immediately have the decomposition N = (x-y)(x+y). You can implement this in several ways, such as iteratively adding a square to N until the result is a square: n+x^2 = y^2. Or, you could start x = 1+|sqrt(N)|, and iteratively search for some i such that (x+i)^2 - N = y^2. Although this method is extremely fast to factor numbers where the factors are 'close' to Sqrt(N), its running time is very poor when they are not. Still, if you are implementing a generic factoring routine, you should run a few iterations of this method just in case the factors happen to be close to Sqrt(N).

Legendre's Congruence

    Legendre's Congruence is when you have x and y such that x^2 = y^2 (mod n). It is basically the same as the x^2-y^2 setup above, but instead of x^2-y^2 = N, we have x^2-y^2 = k*N. If you have such a congruence, then there is a good chance that the factorization of the polynomial on the left (x-y)(x+y) will "split" the factors of N. That is (x-y) = r*p and (x+y) = s*q, where N=p*q. In that case, all we need to do to find a factor is compute the GCD of (x-y) or (x+y) with N.  E.g. GCD(x+y, N). However, if no factorization is found, you have to find another representation and try again. It is important to say that the most advanced and effective algorithms today for factoring large numbers, are based on this idea. E.g. Shank's SQUFOF, the Continued Fraction method, the Quadratic Sieve (QS), and the Number Field Sieve (NFS). However, each method has its own way of generating Lengendre congruences.

Quadratic Sieve (QS)

    The Quadratic Sieve is a common-sense approach for finding solutions x^2 = y^2 (mod n). For most x's you test, the result (x^2 mod n) will not be a square. But, when it isn't, you can think of this result as a 'piece of a square'. All you need to do is take these 'pieces' and combine them together to form a whole square. An example will make this method very obvious. Let's say we test various squares mod n, and find two, x and y, that have the following factorizations mod n:

x^2 = a*b^2 (mod n)

y^2 = a*c^2 (mod n)


Now, each result by itself doesn't give us a Legendre Congruence. But, if we combine these results together we get:

(x*y)^2 = (a*b*c)^2 (mod n)


And viola, we have a Legendre Congruence. Now, we could try factoring N by taking GCD(xy+abc, N).

    The above is a very simplistic model. In practice, you may need to combine hundreds of these partial results to get a solution. There are simple tricks using Matrices and simple Linear Algebra to find combinations that combine to make a perfect square. Normally, you don't collect every x^2 mod n result, and instead only keep those that are 'smooth', divisible by only small primes. You typically choose a 'factor base' which is basically a set of small primes, that you use to decompose the trial x's. If the result doesn't decompose with just primes from the factor base, you skip it and choose another. There are also many variations of the Quadratic Sieve, such as using different polynomials than 'x^2', the 'Large Prime' variation (extending the factor base to include various large primes found during trial factorizations), and 'Early Abort' strategy (skipping number heuristically based on whether or not it is divisible by some threshold of small primes). The Multiple Polynomial Quadratic Sieve (MPQS) is the fastest algorithm available today for factoring general numbers up to 130 digits. It's only rival is the General Number Field Sieve (GNFS), which is currently faster for numbers exceeding 130 digits.

Number Field Sieve (NFS)

   The Number Field Sieve is an efficient way of factoring large numbers of the form N = r^e + s, where r and s are relatively small an e possibly large. It is mostly just like the Quadratic Sieve, except that instead of using rational integers, it uses integers from a "cleverly chosen" algebraic field (which depends on the form of N). The best example I've seen is in a book by Hans Reisel.

Pollard (p-1) method

   The Pollard (p-1) method is an obvious approach for anyone who knows Fermat's Little Theorem (a^(p-1) = 1 (mod p)), and is interested in factoring numbers. Consider the following congruence:

2^x = c (mod n)

Now, let's imagine n being composite with some prime factor 'p'. Given the above congruence, we also know...

2^x = c (mod p)

However, if x is divisible by (p-1) then using Fermat's Little Theorem we can simplify the latter to:

2^x = 1 (mod p)

Or

2^x - 1 = k*p

So, by taking GCD(2^x - 1, N), we arrive at the factor 'p'. With this construct, all that is needed is to compute some x that is divisible by (p-1). This is typically done by hoping (p-1) consists of only small prime factors, and taking x to be a product of small primes. In practice, you don't actually compute the entire product for 'x'. Instead, you just iteratively build up the result. For example, what we might want is something like this:

2^(2*3*5*7*11*13*...) = c (mod n)

But, instead of computing c directly, taking x as the entire product of primes, you can iteratively compute it as follows:

2^3 = z[0] (mod n)
z[0]^5 = z[1] (mod n)
z[1]^7 = z[2] (mod n)
...
z[i]^q = z[i+1] (mod n)

With this approach you are basically throwing in more and more primes into the product x, hoping that it will be become divisible by (p-1). See my NTL routines page for an example implementation.

Elliptic Curve Method (ECM)

    Considered a generalization of the (p-1) method. Instead of using a multiplicative group (i.e. the decomposition of (p-1) into a product of small primes), you use a group of points on an elliptic curve. With each curve you choose, you are basically doing the same algorithm as (p-1), but since you can choose different curves and thus different groups, you have a better chance of finding a 'smooth' group. For a basic description of ECM along with some example code, see Mark Herkommer's book.

Rho Method

    Th rho method factors numbers by finding the period of a chosen function mod n. Typically this function is x^2+a, where a does not equal 0 or 2, and the technique involved in finding the period is 'Floyd's cycle-finding algorithm.' Basically, you just iteratively calculate the x_i and x_2i elements of this function and wait for them to be equal. See my NTL routines page for an example implementation.


Joe K. Crump | Number Theory Home