(2 x prime) + 1 = ? :The 200-year-old story of Sophie Germain and its 21st century legacy

by Qingbo Wang
figures by Anna Maurer

In 2014, the Fields Medal, often described as the “Nobel Prize of Mathematics,” was awarded to a Harvard alumnus and mathematician, Maryam Mirzakhani. The mathematics society celebrated not only her sophisticated theory of geometry and dynamical systems, but also the fact that she is the first female and Iranian to receive this prestigious prize.

Although the news indicates the unbalanced composition that still plagues the science community, women have been making significant contributions to the field of mathematics through its long history. Sophie Germain is another prominent example a woman defying all odds to pursue her passion in a male-dominated discipline, eventually succeeding in making a lasting impact in the field. Her insights were instrumental in advancing mathematics in her day. Some of her discoveries are still central to important aspects of modern technology, such as Internet security.

The story of a superwoman

Back in 18th century Europe, it was considered inappropriate for women to study mathematics. Sophie Germain was born to a wealthy, upper-class French family in 1776, and at a young age, became enthralled with math after flipping through her father’s books. Defying her parents’ initial disapproval, she crawled out of bed night after night to study after everyone else had gone to sleep. Unfortunately, diligence and talent were not enough to earn her a spot in college due to her gender. Nevertheless, she persisted, borrowing lecture notes from friends and submitting brilliant answers to problem sets under the alias ‘Monsieur LeBlanc’.

Her efforts did not go unnoticed. Soon, through written communications, Monsieur LeBlanc caught the attention of some of the most prominent mathematicians of the day. The male persona did not hold up for long before the likes of Joseph Louis Lagrange and Carl Friedrich Gauss, among the greatest mathematicians of all time, uncovered her true identity. Fortunately, they were unfazed by this realization. The congeniality and support of her male colleagues partly compensated for the social disadvantages that came with gender and allowed her brilliance to come to light. Lagrange became her mentor. Gauss, impressed by her intellect, continued a productive correspondence with Sophie, though regrettably they never met.

Sophie Germain solves a mystery

Among Sophie’s chief interests was number theory, or the study of the properties of integers (e.g. -1, 0, 1, 2, 3, …, see Figure 1). One of her most influential works in the field was the formulation of “Sophie Germain’s Theorem,” which partially solved a mystery that had puzzled all of mathematics, namely Fermat’s Last Theorem, for a specific group of prime numbers [see Note 1]. That seminal contribution involved another one of her signature discoveries, now known as the Sophie Germain Primes, whose relevance has experienced a renaissance in the Internet age.

To embrace Sophie’s world of number theory and understand its application to the IT field, it will be helpful to dig a little deeper into the notion of prime numbers (“primes”), which form a subgroup of natural numbers. You use the natural numbers for counting (e.g. 1, 2, 3, …, see Figure 1) and, for simplicity, hereafter we will refer to them simply as numbers. Many numbers can be written as a product of other numbers, for instance 12 = 4 x 3. Here 3 and 4 are factors of 12. Some, however, can only be evenly divided by exactly two distinct numbers, namely 1 and the number itself. These numbers are known as primes. For example, consider the first five primes, 2, 3, 5, 7, and 11—each of them can only be divided by 1 and itself. You could try to convince yourself that any number greater than 1 can be expressed as a unique product of primes only (e.g. 12 = 2 × 2 × 3, 21 = 3 × 7, &c.). For a given number, the process of finding these prime factors is called “prime factorization” or “prime factor decomposition”.

Now, as a seemingly frivolous exercise, let us now double each of the first 5 primes and then add 1. This gives 5(= 2 × 2 + 1), 7 (= 3 × 2 + 1), 11, 15, and 23. Notice that, except for 15, the resulting numbers are also prime (see Figure 1)! The original primes that yielded new primes in this fashion are called “Sophie Germain Primes” (e.g. 2, 3, 5, 11), a set of numbers whose discovery is credited to Sophie. In general, a Sophie Germain Prime is defined as a prime “s” such that (2 × s) + 1 is also a prime.

Summary of the different categories of integer numbers: Integers are the set of negative and positive whole numbers plus zero. Natural numbers are integers that are larger than zero (i.e. all positive integers). Prime numbers are natural numbers that cannot be divided by any natural numbers other than 1 and itself (i.e all natural numbers that have exactly two distinct divisors). A Sophie Germain Prime is a prime number that satisfy the following property: when you multiply it by 2 and then add 1, you get another prime number. Finally, the new prime numbers generated in such way are called Safe Primes. — **Figure 1:** Summary of the different categories of integer numbers. Integers are the set of negative and positive whole numbers plus zero. Natural numbers are integers that are larger than zero (i.e. all positive integers). Prime numbers are natural numbers that cannot be divided by any natural numbers other than 1 and itself (i.e all natural numbers that have exactly two distinct divisors). A Sophie Germain Prime is a prime number that satisfy the following property: when you multiply it by 2 and then add 1, you get another prime number. Finally, the new prime numbers generated in such way are called Safe Primes.

Prime numbers are ubiquitously used in the field of cryptography, but some are safer than others

You might wonder how Sophie Germain Primes find their application in the contemporary IT field. Sophie herself certainly had no idea when she was wandering through the world of prime numbers. The key fact that ties primes to the world of IT is that when a number is large, we cannot easily tell whether it is prime or not. This actually means that conducting prime factor decomposition of a large number, especially if its prime factors are also large, is not always easy. As a simplified example, by hand, it takes less than a minute to calculate the product, “m”, of two large prime numbers, “p” = 2017 and “q” = 509 (2017 × 509 = 1026653). However, it is far more difficult to reverse this calculation. That is, given the number 1026653, we need to expend many more resources to find its prime factors, because we essentially need to try dividing 1026653 by all the primes preceding 509.

Many of the major crypto-systems (systems that manage your passwords) rely on this property of primes. As Figure 2 illustrates, imagine if you want your friends to send you a secret message, you let them put it in a box and “lock” it using m, the product of two primes. Everybody can see that there is a lock tagged on the box, but nobody except you can open the box without knowing the key p or q, i.e., one of the primes. You receive the box, use p (or q) to open the box, and safely receive the message. In practice, we use primes that are a lot larger than the above example (hundreds of digits long), so that even supercomputers would take more than hundreds of years to get p and q from m by trial and error. Although cryptosystems are in practice much more sophisticated, the basic underlying idea is the same: the safety of a cryptographic system depends on numbers that are difficult to prime factor [Note 2].

Figure 2: Simplified RSA Cryptosystem. In the public key cryptosystem, if your friends (left) want to send an encrypted (i.e. secret) mail to you (right), they encrypt it using the public key. This is done using a product (“m”) of large primes (“p” and “q”), and is analogous to tagging a "lock" to the box. Only you will be able to decrypt it using your private key, which uses the p or q. Since we don't need the key itself to be transmitted, it can be kept secret. This makes it nearly impossible for hackers to steal the message, analogous to not being able to open the box without the key. — **Figure 2:** Simplified RSA Cryptosystem. In the public key cryptosystem, if your friends (left) want to send an encrypted (i.e. secret) mail to you (right), they encrypt it using the public key. This is done using a product (“m”) of large primes (“p” and “q”), and is analogous to tagging a “lock” to the box. Only you will be able to decrypt it using your private key, which uses the p or q. Since we don’t need the key itself to be transmitted, it can be kept secret. This makes it nearly impossible for hackers to steal the message, analogous to not being able to open the box without the key.

So are all large prime numbers equally useful for cryptography? Not necessarily, because mathematicians could efficiently check whether a given number is prime with methods that work better for certain types of primes than others. One of them is known as “Pollard’s p − 1 algorithm”. To understand what it does, let’s again try to prime factor m into p and q. It turns out that, as long as (p-1) (or (q-1)) is a product of relatively small primes ONLY, this algorithm allows us to relatively easily find p and q even when p (or q) are themselves large primes. If, on the other hand, (p-1) or (q-1) themselves have at least one very large prime factor, the algorithm would not perform well. In other words, this means we should avoid using primes that are of the form “(product of small primes plus 1)” (e.g. 19 = (3 × 3 × 2) + 1), see also Figure 3), to avoid smart people finding out the secret key p (or q) that opens the box locked with m.

So, “safe” primes resistant to this efficient check should have the recipe “(product involving at least one large prime) plus 1”. Since all primes larger than 2 are odd, (p – 1) must be even, hence divisible by 2. Therefore, we can show with algebra that every large prime p can be written as p = (2 × n) + 1. The best way to maximize the largest prime factor in (2 × n) is to set n itself to be a prime. In other words, for a prime, p, to be considered “safe” for cryptographic purposes, n = (p – 1)/2 should ideally also be prime.

Now is finally the time to come back to the definition of a Sophie Germain Prime: fortuitously, for a Sophie Germain Prime s, (2 × s) + 1 is also a prime. Therefore, if we let p = (2 × s) + 1, then p is guaranteed to be safe. In other words, “safe primes” are always related to Sophie Germain Primes in this way. Indeed, a “safe prime” is defined as:

Safe Prime = (2 × Sophie Germain Prime) + 1

Safe primes are fundamental in the field of cryptography, which means that the Sophie Germain Primes form the foundation that underlies today’s security systems. It would be fun to imagine what Sophie would feel, if she were alive to know that her enthusiasm toward math has become a central pillar supporting 21st century technology.

Figure 3: Comparing safe prime and non-safe prime with a simple example. If we write two numbers 107 and 109 as (product of primes) + 1 and look at the primes inside the brackets, 107 contains a relatively large prime, 53, while 109 contains only 2s and 3s, the first and the second smallest existing primes. Thus 107 is safer than 109. This example also shows that the prime number being large itself does not make it safer -- 109 is larger than 107 but is less safe! — **Figure 3:** Comparing safe prime and non-safe prime with a simple example. If we write two numbers 107 and 109 as (product of primes) + 1 and look at the primes inside the brackets, 107 contains a relatively large prime, 53, while 109 contains only 2s and 3s, the first and the second smallest existing primes. Thus 107 is safer than 109. This example also shows that the prime number being large itself does not make it safer — 109 is larger than 107 but is less safe!

“Sophie Safe Days”

In honor of Sophie Germain, let us celebrate “Sophie Safe Days” [Note 3] throughout the year, the most recent of which was on 5/23. They serve as good opportunities for us to look back and appreciate the intellectual journey of great mathematicians!

Qingbo Wang is a first-year graduate student in the Bioinformatics and Integrative Genomics PhD program at Harvard University

For more information:

http://www.di-mgt.com.au/rsa_alg.html
Weisstein, Eric W. “RSA Encryption.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/RSAEncryption.html
Weisstein, Eric W. “Pollard p-1 Factorization Method.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/Pollardp-1FactorizationMethod.html
Rivest, Ronald L., Adi Shamir, and Leonard Adleman. “A method for obtaining digital signatures and public-key cryptosystems.” Communications of the ACM 21.2 (1978): 120-126.
Pollard, John M. “Theorems on factorization and primality testing.” Mathematical Proceedings of the Cambridge Philosophical Society. Vol. 76. No. 03. Cambridge University Press, 1974.

(Understanding Fermat’s little theorem and Euler’s Totient Function also might help.)

Supplementary Information

Note 1. Fermat’s last theorem and Sophie Germain’s contribution

Fermat’s last theorem is widely known for its simplicity at first sight:

The equation xⁿ+ yⁿ= zⁿ has no positive integer solutions for n > 2.

If we let n=2, we do know that there are numerous (infinite) solutions that satisfies x²+ y²= z², known as Pythagorean triples (eg. {3,4,5}, {5,12,13}, {8,15,17}). Analogously, a number of mathematicians tried to examine the case of n=3,4,5 and so on. Fermat himself proved the case of n=4 (1640s), whereas the famous Euler did later for the case of n=3.

What is unique about Sophie in her time was that she approached the theorem by considering not a specific number of n (n=3,4 or 5 and so on). Instead, she focused on a particular sub-class of integers: all prime numbers “s” for which (2×s+1) are also primes (this is where the name “Sophie Germain Prime” came from). Here is her theorem:

Let s be a Sophie Germain prime. If the equation x^s+ y^s= z^s has any positive integer solution, either x, y or z needs to be divisible by s.

This represented a large contribution to the proof of the theorem by significantly narrowing down the scope of search.

Note that Fermat’s Last Theorem was completely proved for all n in 1994, more than 150 years after her death.

Note 2. RSA cryptosystem

The key & lock analogy in the main article is a simplified example of a cryptosystem called RSA. Developed by mainly three scientists, Ron Rivest, Adi Shamir, and Leonard Adleman (namely R.S.A.) at MIT in late 1970s, RSA system is widely known as one of the first practical “public key” cryptosystems (cf. in the “secret key” system, a single secret key is used for both encoding and decoding, but for the “public key” system, the key for encoding is open to the public – this corresponds to the “lock” that everyone can use).

The depictions of the RSA cryptosystem and Pollard’s (p-1) algorithm in this article are vastly simplified. In reality, they are quite involved and, even so, they represent only very basic examples in the much more complicated world of cryptography.

Note 3. Sophie Safe Days:

We have mentioned in this article that 5 ( = 2 × 2 + 1) and 23 (= 11 × 2 + 1) are among the smallest Sophie Germain primes. Incidentally, they are also both safe primes (5 × 2 + 1 = 11 and 23 × 2 + 1 = 47, both 11 and 47 are primes, see also Figure 1). For fun, let us define “Sophie Safe Day” as a day where the date and month are both Sophie Germain prime and safe prime. It turns out there are only 6 such Sophie Safe Days per year: 5/5, 5/11, 5/23, 11/5, 11/11, 11/23.

Interestingly, if we focus on the primes involved in 5/23, i.e. 2, 5, 11, 23, 47, we find it to be a consecutive sequence of Sophie germain primes derived from multiplying by 2 and adding 1. This kind of sequence is called Cunningham Chain (of the first kind) of length 5.
Finding long Cunningham Chains has been interesting to mathematicians, and the longest known Cunningham Chain is of length 19.