## Pollard's Rho Method(转)

## Introduction

Suppose you had some large number that you knew was not a prime number and you needed to find out what its factors are. How would you go about doing that? You could try dividing it by

2,3,4,5, etc. until you find one that works. You could even optimize that a bit if you by only trying to divide by prime numbers. But, if the smallest divisor of your number is pretty large, you've got a great deal of work ahead of you.John Pollard published his Rho Method of Factorization in 1974. It takes advantage of some properties of divisors of numbers to zoom in on a factor quite quickly. Once you have one factor, you can readily break the original number into two smaller numbers to factor.

This is a brief description of the Pollard's Rho Algorithm. If you're already familiar with modulo arithmetic and greatest common divisors, you can skip ahead to

the actual algorithm.

## Modulo Arithmetic

If you're already comfortable with addition, subtraction, multiplication, and exponentiation modulo a number, feel free to skip over this whole section.

## Definition Modulo

Two numbers

xandyare defined to becongruent moduloif their differencen(x-y)is an integer multiple ofn. Another way of stating this is to say thatxandyboth have the same remainder when divided byn.For example, suppose that

x = qwhere_{x}*n + r_{x}0 <= r. And, suppose that_{x}< ny = qwhere_{y}*n + r_{y}0 <= r. Then,_{y}< nxis congruent toymodulonif and only ifr. You can see that if_{x}= r_{y}r, then_{x}= r_{y}(x-y)is justq_{x}*n - q_{y}*n + r_{x}- r_{y}= (q_{x}-q_{y}) * n

For a concrete example, suppose that

x = 37,y = -14andn = 17.xis congruent toymodulon. We know this because(x-y) = 37 + 14 = 51 = 3*17

We could also tell this becausex = 2*17 + 3andy = (-1)*17 + 3—both have a remainder of3when divided by17. By the same logic, it is also easy to see that bothxandyare congruent to3since3 = 0*17 + 3.

## Modulo Operations

We often speak of doing some operation (addition, subtraction, multiplication, etc.) “modulo

n”. This simply means, do the operation and then rewrite the answer to be the number that's at least0, is less thann, and is congruent modulonwith the answer.For example,

37 + 15 = 1modulon. This is often abbreviated like this:37 + 15 = 1 (mod n)

Conveniently, one can take any number in a modulo equation and replace it with any number which is congruent to it. It is usually convenient to pick the smallest positive number which foots the bill. So, we could redo the equation37 + 15 = 1without having to add those huge numbers37and15or to divide that huge sum of52by17. Instead, we could replace the37with3because they are congruent with each other modulo17. So,37 + 15 = 3 + 15 = 18 = 1 (mod n)

The same thing holds for subtraction and for multiplication and for exponentiation. So, it is easy to see that

37modulo^{4}= 1317. We simply replace the37by3. Then, we break up the exponentiation a bit.37^{4}= 3^{4}= ( 3^{3})*3 = 27 * 3 = 10 * 3 = 30 = 13

because27 = 10and30 = 13modulo17.

## Greatest Common Divisor (Euclidean Algorithm)

For Pollard's Rho Method, one needs to be able to find the Greatest Common Divisor of two numbers. The

Greatest Common Divisoris the largest number which divides evenly into each of the original numbers. The Greatest Common Divisor ofaandbis often written “gcd(a,b)”. Sometimes, you will see it written simply as “(a,b)”.The Greatest Common Divisor is symmetric. This is

gcd( a,b) = gcd(b,a)

The usual method for finding the Greatest Common Divisor is the Euclidean Algorithm. The

Euclidean Algorithmgoes like this.... Start with the numbersaandb. Expressaas a multiple ofbplus a remainderrwhich is greater than or equal to zero and less thanb. Ifris greater than zero, setaequal toband setbequal tor. Lather. Rinse. Repeat.As you can see,

rdecreases every iteration until it reaches zero. On the first pass, it cannot be as big asb. On the second pass, it cannot be as big as it was on the first pass. On then-th pass, it cannot be as big as it was on the previous pass. Eventually,rhas to get to zero. When it does, thenb(which was the previousr) is the Greatest Common Divisor of the originalaandb. [Actually, if the originalbis some multiple of the originala, then the firstrwill be zero. In that case, the Greatest Common Divisor will actually beainstead ofb. You can avoid this problem by always starting withabeing the number which has the highest absolute value.]For example, let us find the Greatest Common Divisor of

658and154. This leads to the following sequence of equations.658 = 4 * 154 + 42

154 = 3 * 42 + 28

42 = 1 * 28 + 14

28 = 2 * 14 + 0

Which means that14is the greatest common divisor of658and154.You can see that

14divides evenly into154and168by propagating back up the that list of equations. The last equation shows that14divides into28. The equation before that shows that42is some multiple of something which is a multiple of14with an extra14tacked on the end. This means that42is a multiple of14since it is the sum of two things which are multiples of14. The equation above that shows that154is the sum of a multiple of42and28. Both42and28are multiples of14, so154is also a multiple of14. And, the last equation, by similar logic, shows that658is divisible by14.Unfortunately, in the preceding paragraph, we only managed to show that

14was a common divisor of658and154. We didn't show that it was necessarily the largest divisor common to both. That part is more complicated. At the time of writing here, I don't feel like getting into that. You can find ample documentation of the Euclidean Algorithm in text books and on the web if you're interested in that part of the proof.

## Pollard's Rho Method

Now, on to the actual algorithm.

## The Algorithm

Say, for example, that you have a big number

nand you want to know the factors ofn. Let's use16843009. And, say, for example, that we know thatnis a not a prime number. In this case, I know it isn't because I multiplied two prime numbers together to maken. (For the crypto weenies out there, you know that there a lot of numbers lying around which were made by multiplying two prime numbers together. And, you probably wouldn't mind finding the factors of some of them.) In cases where you don't know, a priori, that the number is composite, there are a variety of methods to test for compositeness.Let's assume that

nhas a factord. Since we knownis composite, we know that there must be one. We just don't know what its value happens to be. But, there are some things that we do know aboutd. First of all,dis smaller thann. In fact, there are at least somedwhich are no bigger than the square root ofn.So, how does this help? If you start picking numbers at random (keeping your numbers greater or equal to zero and strictly less than

n), then the only time you will geta = bmodulonis whenaandbare identical. However, sincedis smaller thann, there is a good chance thata = bmodulodsometimes whenaandbare not identical.Well, if

a = b (mod d), that means that(a-b)is a multiple ofd. Sincenis also a multiple ofd, the greatest common divisor of(a-b)andnis a positive, integer multiple ofd. So, now, we can dividenby whatever this greatest common divisor turned out to be. In doing so, we have broken downninto two factors. If we suspect that the factors may be composite, we can continue trying to break them down further by doing the algorithm again on each half.The amazing thing here is that through all of this, we just knew there had to be some divisor of

n. We were able to use properies of that divisor to our advantagebeforewe even knew what the divisor was!This is at the heart of Pollard's rho method. Pick a random number

a. Pick another random numberb. See if the greatest common divisor of(a-b)andnis greater than one. If not, pick another random numberc. Now, check the greatest common divisor of(c-b)andn. If that is not greater than one, check the greatest common divisor of(c-a)andn. If that doesn't work, pick another random numberd. Check(d-c),(d-b), and(d-a). Continue in this way until you find a factor.As you can see from the above paragraph, this could get quite cumbersome quite quickly. By the

k-th iteration, you will have to do(k-1)greatest common divisor checks. Fortunately, there is way around that. By structuring the way in which you pick “random” numbers, you can avoid this buildup.Let's say we have some polynomial

f(x)that we can use to pick “random” numbers. Because we're only concerned with numbers from zero up to (but not including)n, we will take all of the values off(x)modulon. We start with somex. We then choose to pick our random numbers by_{1}x._{k+1}= f(x_{k})Now, say for example we get to some point

kwherexmodulo_{k}= x_{j}dwithk < j. Then, because of the way that modulo arithmetic works,f(xwill be congruent to_{k})f(xmodulo_{j})d. So, once we hit uponxand_{k}x, then each element in the sequence starting with_{j}xwill be congruent modulo_{k}dto the corresponding element in the sequence starting atx. Thus, once the sequence gets to_{j}xit has looped back upon itself to match up with_{k}x(when considering them modulo_{j}d).This looping is what gives the Rho method its name. If you go back through (once you determine

d) and look at the sequence of random numbers that you used (looking at them modulod), you will see that they start off just going along by themselves for a bit. Then, they start to come back upon themselves. They don't typically loop the whole way back to the first number of your sequence. So, they have a bit of a tail and a loop---just like the greek letter “rho”.Before we see why that looping helps, we will first speak to why it has to happen. When we consider a number modulo

d, we are only considering the numbers greater than or equal to zero and strictly less thand. This is a very finite set of numbers. Your random sequence cannot possibly go on for more thandnumbers without having some number repeat modulod. And, if the functionf(x)is well-chosen, you can probably loop back a great deal sooner.The looping helps because it means that we can get away without piling up the number of greatest common divisor steps we need to perform. In fact, it makes it so that we only need to do one greatest common divisor check for every second random number that we pick.

Now, why is that? Let's assume that the loop is of length

tand starts at thej-th random number. Say that we are on thek-th element of our random sequence. Furthermore, say thatkis greater than or equal tojandtdividesk. Becausekis greater thanjwe know it is inside the looping part of the Rho. We also know that iftdividesk, thentalso divides2*k. What this means is thatxand_{2*k}xwill be congruent modulo_{k}dbecause they correspond to the same point on the loop. Because they are congruent modulod, their difference is a multiple ofd. So, if we check the greatest common divisor of(xwith_{k}-x_{k/2})nevery time we get to an evenk, we will find some factor ofnwithout having to dok-1greatest common divisor calculations every time we come up with a new random number. Instead, we only have to do one greatest common divisor calculation for every second random number.The only open question is what to use for a polynomial

f(x)to get some random numbers which don't have too many choices modulod. Since we don't usually know much aboutd, we really can't tailor the polynomial too much. A typical choice of polynomial isf(x) = x^{2}+ a

whereais some constant which isn't congruent to0or-2modulon. If you don't place those restrictions ona, then you will end up degenerating into the sequence{ 1, 1, 1, 1, ... }or{ -1, -1, -1, -1, ... }as soon as you hit upon somexwhich is congruent to either1or-1modulon.

## Examples

Let's use the algorithm now to factor our number

16843009. We will use the sequencexwith_{1}=1x(where the function is calculated modulo_{n+1}= 1024*x_{n}^{2}+ 32767n). [ I also tried it with the very basic polynomialf(x) = x, but that one went 80 rounds before stopping so I didn't include the table here.]^{2}+ 1

kx_{k}gcd ( n, x_{k}-x_{k/2})1 1 2 33791 1 3 10832340 4 12473782 1 5 4239855 6 309274 0 7 11965503 8 15903688 1 9 3345998 10 2476108 0 11 11948879 12 9350010 1 13 4540646 14 858249 0 15 14246641 16 4073290 0 17 4451768 18 14770419 257 Let's try to factor again with a different random number schema. We will use the sequence

xwith_{1}=1x(where the function is calculated modulo_{n+1}= 2048*x_{n}^{2}+ 32767n).

kx_{k}gcd ( n, x_{k}-x_{k/2})1 1 2 34815 1 3 9016138 4 4752700 1 5 1678844 6 14535213 257 There is an art to picking the polynomial. I don't know that art at all. I tried a couple of polynomials until I found one that zoomed in relatively quickly. If I had to factor something with this method, I would generate a few small polynomials at random and try them out in parallel until one of them found a factor or I got tired of waiting

copy from:http://www.csh.rit.edu/~pat/math/quickies/rho/#algorithm

posted on 2009-11-04 23:43 abilitytao 阅读(1311) 评论(0) 编辑 收藏 引用