# Fermat Factorization

Proposition. Let $n$ be a positive odd integer. There is a one-to-one correspondence between factorizations of $n$ in the form $n=ab$ , where $a\geq b>0$, and the representations of $n$ in the form $t^2-s^2$, where $s$ and $t$ are nonnegative integers. The correspondence is given by the equation
$$t=\frac{a+b}{2},\ s=\frac{a-b}{2};\ a=t+s,\ b=t-s$$

Proof. Given $n=ab$, we can write
$$n=ab=\left(\frac{a+b}{2}\right)^2-\left(\frac{a-b}{2}\right)^2$$
Conversely, given $n=t^2-s^2$, $n$ can be factored as $n=(t+s)(t-s)$.

If $n=ab$ with $a$ and $b$ close together, then $s=\frac{a-b}{2}$ is small, and so $t=\frac{a+b}{2}$ is slightly larger than $\sqrt{n}$. In that case, we can find $a$ and $b$ by trying all values of $t$ starting with $[\sqrt{n}]+1$, until we find one for which $t^2-n=s^2$ is a perfect square. This method is called the Fermat factorization.

Example. Factor $200819$.

Solution. $[\sqrt{200819}]+1=449$.
\begin{align*} 449^2-200819&=782,\ \mbox{not a perfact square}\\ 450^2-200819&=1681=41^2 \end{align*}
Hence,
$$200819=450^2-41^2=(450+41)(450-41)=491\cdot 409$$

If $a$ and $b$ are not close together for $n=ab$, although the Fermat factorization method will eventually find $a$ and $b$, one will have to try a large number of $t=[\sqrt{n}]+1, [\sqrt{n}]+2,\cdots$. There is a generalization of Fermat factorization. Choose small $k$, successively set $t=[\sqrt{kn}]+1, [\sqrt{kn}]+2,\cdots$, etc. until we find a $t$ for which $t^2-kn=s^2$ is a perfect square. Then $(t+s)(t-s)=kn$ and a nontrivial common factor of $n$ can be found by calculating $(t+s,n)$.

Example. Factor 141467.

Solution. First we factor 141467 by simple Fermat factorization. Since $\sqrt{141467}\approx 376.1209911717239$, we successively try $t=377,378,\cdots$ until we find $t^2-141467$ is a perfect square. We find
$$t^2-141467=414^2-141467=29929=173^2$$
Thus,
$$141467=414^2-173^2=(414+173)(414-173)=587\cdot 241$$
Now, this time we factor 141467 by generalized Fermat factorization with $k=3$. We try $t=[\sqrt{3n}]+1=652, 653,\cdots$ and find
$$t^2-3n=655^2-3\cdot 141467=4624=68^2=s^2$$
$(t+s,n)=(723,141467)=241$ and so $141467=241\cdot 587$.

The following two propositions tell us that it is not a good idea to choose an even number for $k$ in generalized Fermat factorization.

Proposition. If $k=2$, or if $k$ is any integer divisible by 2 but not by 4, then we cannot factor a large odd $n$ using generalized Fermat factorization with this choice of $k$.

Proof. $n$ is an odd integer, so $n=2m+1$ for some $m\in\mathbb{Z}$. For $k=2$, $kn=4m+2\equiv 4\mod 4$. If $k$ is an integer divisible by 2 but not by 4, $k=2(2l+1)=4l+2$ for some $l\in\mathbb{Z}$ and $kn\equiv 2\mod 4$. $t^2-s^2=kn\equiv 2\mod 4$, but the difference of two squares cannot be 2 modulo 4. (Each of the integers $t$ and $s$ is one of the forms $4u$, $4u+1$, $4u+2$, or $4u+3$ for some $u\in\mathbb{Z}$, so there are 16 different cases of $t$ and $s$ and in each case you can easily check if $t^2-s^2$ can be 2 modulo 4. For instance if $t=4u_1+1$ and $s=4u_2+2$, then
$t^2-s^2\equiv 1^2-2^2\equiv 1\mod 4$.)

Proposition. If $k=4$, and if generalized Fermat factorization works for a certain $t$, then simple Fermat factorization (with $k=1$) would have worked equally well.

Proof. $t^2-s^2=kn=4n\equiv 4\mod 8$ which can hold only if both $s$ and $t$ are even. In that case, $\left(\frac{s}{2}\right)^2=\left(\frac{t}{2}\right)^2-n$, so simple Fermat factorization would have worked equally well.

# Factoring by the Monte Carlo Method

The Monte Carlo method is a computational simulation scheme, originally introduced by Stanisław Ulam, that solves a wide variety of problems arising in chemistry, economics, finance, mathematics, physics, etc. In this note, we discuss an application of the Monte Carlo method in factoring of integers. It is also called rho method and was introduced by J. M. Pollack.

The first step is to choose an easily evaluated map from $\mathbb{Z}_n$ to itself. A popular choice is $f(x)=x^2+1$. Next, one chooses an initial value $x=x_0$, and then computes the successive iterates of $f$: $x_1=f(x_0)$, $x_2=f(x_1)$, $x_3=f(x_2)$, etc. i.e. $x_{j+1}=f(x_j)$, $j=0,1,2,\cdots$. Compare the $x_j$’s, hoping to find two which are different modulo $n$ but the same modulo some divisor of $n$. Once we find such $x_i$, $k_k$, we have $(x_k-x_j,n)$ equal to a proper divisor of $n$

Example. Let us factor $91$ by choosing $f(x)=x^2+1$, $x_0=1$.
\begin{align*} x_1&=f(x_0)=2\\ x_2&=f(2)=5\\ x_3&=f(5)=26\\ &\vdots \end{align*}
Since $(x_3-x_2,n)=(21,91)=7$, 7 is a factor.

The method works by successively computing $x_k=f(x_{k-1})$ and comparing $x_k$ with the earlier $x_j$ until we find a pair satisfying $(x_k-x_j,n)=r>1$. But as $k$ gets larger, it becomes more time consuming to compute $(x_k-x_j,n)$ for each $j<k$. Note that once there is a pair $(k_0,j_0)$ such that $x_{k_0}\equiv x_{j_0} \mod r|n$, we have the same relation $x_k\equiv x_j\mod r$ for any pair $(j,k)$ such that $k-j=k_0-j_0$: Set $k=k_0+m$ and $j=j_0+m$, and apply $f$ to both sides of the congruence $x_{k_0}\equiv x_{j_0}\mod r$ repeatedly $m$ times.

The previous algorithm can be modified so that we need to calculate the gcd only once for each $k$. This significantly reduces the required computational burden. Here is the modified algorithm.

We successively compute the $x_k$. For each $k$, we proceed as follows. Suppose $k$ is an $(h+1)$-bit integer, i.e. $2^h\leq k<2^{h+1}$. Let $j$ be the largest $h$-bit integer: $j=2^h-1$. We compare $x_k$ with this particular $x_j$, i.e. compute $(x_k-x_j,n)$. If this gcd gives a nontrivial factor of $n$, stop. Otherwise continue on to $k+1$.

Example. $n=91$, $f(x)=x^2+1$, $x_0=1$.
\begin{align*} x_1&=f(1)=2\\ x_2&=f(2)=5;\ (x_2-x_1,n)=(5-2,91)=1\\ x_3&=f(5)=26;\ (x_3-x_1,n)=(24,91)=1\\ x_4&=f(26)=26^2+1\equiv 40\mod 91;\ (x_4-x_3,n)=(14,91)=7 \end{align*}

Example. Factor 4087 using $f(x)=x^2+x+1$ and $x_0=2$.
\begin{align*} x_1&=f(2)=7;\ (x_1-x_0,n)=(7-2,4087)=1\\ x_2&=f(7)=57;\ (x_2-x_1,n)=(57-7,4087)=1\\ x_3&=f(57)=3307;\ (x_3-x_1,n)=(3307-7,4087)=1\\ x_4&=f(3307)\equiv\mod 4087;\ (x_4-x_3,n)=(2745-3307,4087)=1\\ x_5&=f(2745)\equiv 1343\mod 4087; (x_5-x_3,n)=(1343-3307,4087)=1\\ x_6&=f(1343)\equiv 2626\mod 4087;\ (x_6-x_3, n)=(2626-3307,4087)=1\\ x_7&=f(2626)\equiv 3734\mod 4087;\ (x_7-x_3,n)=(3734-3307,4087)=61 \end{align*}
Hence, $61$ is a factor of $4087$ and $4087=61\cdot 67$.

# Computing for Number Theory 1: GCD, LCD, Linear Diophantine Equations

Here, we study some computing problems that are related to Divisibility and the Division Algorithm.

#### Maxima

GCD: Maxima has a function to compute the GCD (Greatest Common Divisor) gcd(a,b).

Example. Find the GCD of 158 and 188.

(%i1) gcd(158,188);

(%o1) 2

LCM: Maxima has a function to compute the LCM (Least Common Multiple) lcm(express_1,...express_n). This function needs to be loaded by running the command load("functs").

Example. Find the LCD of 6, 8, 15, 32.

(%i1) load("functs")$ (%i2) lcm(6,8,15,32); (%o2) 480 Bézout’s Lemma (see here) says that given two integers$a,b$, there is an integer solution$(x,y)$to the equation$ax+by=(a,b)$. The Maxima command gcdex implements the Euclidean algorithm. gcdex(a,b) returns [x, y, u] where$u=(a,b)$and$ax+by=u$. So, one can find a solution of the equation$ax+by=(a,b)$using gcdex. Example. Solve$2=158x+188y$. We saw earlier that$(158,188)=2$. (%i2) gcdex(158,188); (%o2) [25, - 21, 2] Hence,$x=25$and$y=-21$. One can also use the command igcdex. gcdex and igcdex both implement the Euclidean algorithm. The only difference between them is that the argument of igcdex must be integers while the arguments of gcdex can be also polynomials. For instance, (%i7) gcdex(x^2 + 1,x^3 + 4); (%o7)/R/ [$\displaystyle\frac{x^2+4x-1}{17}$,$\displaystyle\frac{x+4}{17}$, 1] #### Python GCD: Algorithm 1. The name of the Python code is gcd(a,b) and it implements the Euclidean algorithm to find the GCD of two integers$a$and$b$. 2. If b == 0 then gcd(a,b) returns 0 because every nonzero integer divides 0. 3. Otherwise, it returns gcd(b,a % b). % is the modulo or remainder operator. a % b returns the remainder when$a$is divided by$b$if$a\geq b$or$a$if$a<b$because$a=b\cdot 0+a$. For example, 12345 % 2 returns 1. 4. The loop continues until b==0. Use your favorite editor to write the following python code (it is called a Python module) and save it as, for example, gcd.py. The Python interface IDLE has own editor. I also recommend Emacs and Geany for Python editors if you are running Python in command shell with iPython. # Computing the GCD of two integers a and b recursively using the Euclidean algorithm def gcd(a,b): if (b == 0): return a return gcd(b, a%b) On IDLE, the Python module can be run from its pull-down (or drop-down) menu under Run. Its shortcut key is F5. To run it on iPython, first change your current folder location to the one that contains the Python module gcd.py using cd (change directory) command in command shell and then start iPython by typing ipython (you can first start iPython and then change your current folder location as well using cd within iPython) and pressing the enter key. Once iPython is loaded, do $ ipython
Python 3.7.9 (default, Feb 4 2021, 01:17:56)
IPython 7.19.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: run gcd.py
In [2]: gcd(98,56)
Out[2]: 14

The Euclidean Algorithm: The Euclidean algorithm itself is a perfect computer algorithm. A Python code of the Euclidean algorithm is

# This Python module runs the Euclidean Algorithm to find the GCD of two integers a and b
def euclid(a,b):
r = a % b
while r > 0:
print (a, b, r)
a, b, r = b, r, b % r
return b

Running this Python module (named euclid.py) on iPython shows