Some Basic and Inefficient Prime Number Generating Algorithms

PGsimple1
In a typical programming language course, the student is given a project to write a program that generates prime numbers. This is considered a relatively easy task and is usually assigned within the first few weeks of the course. As I'm sure you are aware, a few simple and effective algorithms can be used to complete the assignment in as little as a few minutes. In the following examples, I will use the Python 2.5 programming language to demonstrate such algorithms and compare their efficiencies.

The first algorithm we shall consider begins with the integer 2 and selects each successive integer as a potential prime (pp), checking for primality by testing whether it is divisible by any previously identified prime, then storing each newly verified prime in a prime set (ps) array.
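The description above can be sketched roughly as follows. This is a minimal reconstruction from the prose, not the author's original listing: the function name `pgsimple1` and its `limit` parameter are assumptions made for illustration, while `pp` and `ps` follow the abbreviations used in the text.

```python
def pgsimple1(limit):
    """Return all primes <= limit by testing against every earlier prime."""
    ps = []   # prime set: every prime verified so far
    pp = 2    # potential prime, starting from the integer 2
    while pp <= limit:
        # pp is prime if none of the previously identified primes divides it.
        if all(pp % p != 0 for p in ps):
            ps.append(pp)
        pp += 1   # select each successive integer in turn
    return ps


print(pgsimple1(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Every potential prime that turns out to be prime is tested against the whole prime set, which is what makes this version so slow at larger limits.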

Note: The code given above does not constitute a complete program. See Appendix A for a complete program including a user interface.

Such a rudimentary algorithm takes a strictly brute-force approach to identifying each prime number and storing it in an array. I am sure you would agree that this is also about the least efficient means of generating prime numbers. As we shall see, using elements of sieving processes will increase the efficiency of our program while avoiding the time-consuming property of a true Sieve of Eratosthenes, which selects every consecutive integer as a potential prime before identifying it as factorable and eliminating it as a prime number, much the way PGsimple1 has done.

Runtime Data

These are the best time results taken from 5 test runs at each limit.

This table records the runtimes for pgsimple1 up to each limit indicated. Please note that these runtimes, and all of the runtimes given throughout this document, may differ somewhat from those you get running the same program on a computer with different hardware or software than mine, which has an AMD Turion 64 at 1.9 GHz with 2 GB RAM, a 160 GB HDD and Windows Vista.
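For readers who want to collect comparable figures on their own machine, a "best of 5 runs" measurement can be sketched with the standard `timeit` module. The helper name `best_of` and the stand-in generator `primes_up_to` are assumptions for illustration only; the text does not say how the author's timings were actually taken.

```python
import timeit


def primes_up_to(limit):
    """Stand-in generator to time; any function taking a limit would do."""
    ps = []
    for pp in range(2, limit + 1):
        if all(pp % p != 0 for p in ps):
            ps.append(pp)
    return ps


def best_of(func, limit, runs=5):
    """Call func(limit) `runs` times and return the fastest time in seconds."""
    return min(timeit.repeat(lambda: func(limit), number=1, repeat=runs))


for limit in (100, 1000, 10000):
    print(limit, best_of(primes_up_to, limit))
```

Taking the minimum rather than the average reduces the influence of other processes that happen to be running during a test.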

PGsimple2
A first step toward improving this algorithm might be selecting potential primes more efficiently. It is most common to see such a device in algorithms which start with the integer 3 and proceed by selecting successive potential primes through the odd integers only. This reduces by half the total number of potential primes which must be tested.
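A minimal sketch of this odd-only selection, again reconstructed from the prose rather than copied from the author's listing (the function name `pgsimple2` and the `limit` parameter are assumptions):

```python
def pgsimple2(limit):
    """Return all primes <= limit, selecting only odd potential primes."""
    if limit < 2:
        return []
    ps = [2]   # 2 is stored up front because it is never selected below
    pp = 3     # first odd potential prime
    while pp <= limit:
        if all(pp % p != 0 for p in ps):
            ps.append(pp)
        pp += 2   # skip every even integer
    return ps


print(pgsimple2(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```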

Now, brute force has been augmented by some simple logic to significantly improve efficiency, reducing the number of potential primes by half.

Runtime Data

These are the best time results taken from 5 test runs at each limit.

This table records the runtimes for pgsimple2 and how many times faster it completes the run up to each limit compared to pgsimple1. Note that the efficiency remains close to double that of pgsimple1 at any limit. Even at this speed, it is still quite impractical to generate primes of 8 digits or more, but I did it anyway just to see how long it would take.

PGsimple3
The next most obvious improvement would probably be limiting the testing process to checking only whether the potential prime is divisible by those primes less than or equal to its square root. If a potential prime has a factor larger than its square root, the complementary factor must be smaller than the square root, so every composite potential prime has at least one prime factor no greater than its square root, and testing beyond that point cannot find anything new.
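The square-root cutoff might look like the sketch below. It is an illustration under assumptions: the function name is hypothetical, and `math.isqrt` (available from Python 3.8) is used for an exact integer square root, which the Python 2.5 original could not have used.

```python
from math import isqrt


def pgsimple3(limit):
    """Return all primes <= limit, testing only primes up to sqrt(pp)."""
    if limit < 2:
        return []
    ps = [2]
    pp = 3
    while pp <= limit:
        root = isqrt(pp)      # largest integer whose square is <= pp
        is_prime = True
        for p in ps:
            if p > root:      # no factor can be found past the square root
                break
            if pp % p == 0:
                is_prime = False
                break
        if is_prime:
            ps.append(pp)
        pp += 2
    return ps


print(pgsimple3(50))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
```

For a potential prime near one million, this stops after the primes below 1000 instead of walking through tens of thousands of stored primes.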

This algorithm makes truly significant strides in efficiency and, at this point, most programmers have exhausted their ability or desire to continue improving the efficiency of the prime number generator, but we shall go on.

Runtime Data

These are the best time results taken from 5 test runs at each limit.

This table records the runtimes for pgsimple3 and how many times faster it completes the run up to each limit compared to pgsimple1 and pgsimple2. Note that the longer the program is run, the more significant the efficiency becomes.

PGsimple4
Because a skip number of 2 selects only odd potential primes, it is no longer necessary to test them against all the primes up to the square root of the potential prime, as none of them can be divisible by 2. Therefore, we can remove the first prime number from the set of primes which we test potential primes against. This requires dividing the prime set (ps) array into excepted prime (ep) and test prime (tp) arrays, then recombining them at the end to send the complete set back to the function call.
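One way to picture the split into excepted primes, test primes and a skip set is the following sketch. The array names `ep`, `tp` and `ss` come from the text; the function name, the `limit` parameter and the use of `math.isqrt` are assumptions for illustration.

```python
from math import isqrt


def pgsimple4(limit):
    """Return all primes <= limit without ever testing divisibility by 2."""
    if limit < 2:
        return []
    ep = [2]   # excepted primes: their multiples are never selected
    if limit < 3:
        return ep
    tp = [3]   # test primes: the only primes used for trial division
    ss = [2]   # skip set: the step(s) used to select potential primes
    pp = tp[-1] + ss[0]
    while pp <= limit:
        root = isqrt(pp)
        is_prime = True
        for p in tp:
            if p > root:
                break
            if pp % p == 0:
                is_prime = False
                break
        if is_prime:
            tp.append(pp)
        pp += ss[0]
    return ep + tp   # recombine so the caller receives the complete set


print(pgsimple4(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```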

In the next version it will be shown why we put the skip number (2) into a skip set (ss) array.

Runtime Data

These are the best time results taken from 5 test runs at each limit.

This table records the runtimes for pgsimple4 and how many times faster it completes the run up to each limit compared to pgsimple3.

What improvement in efficiency? Note that there is only a marginal increase in efficiency compared to pgsimple3. Worry not: increases in efficiency multiply as more primes are eliminated from the testing process in the more advanced version of the program, which I will show you next.

PG7.8
This algorithm efficiently selects potential primes by eliminating multiples of previously identified primes from consideration, and it minimizes the number of tests which must be performed to verify the primality of each potential prime. While the efficiency of selecting potential primes allows the program to sift through a greater range of numbers per second the longer the program is run, the number of tests which need to be performed on each potential prime does continue to rise (but at a slower rate than in other algorithms). Together, these processes bring greater efficiency to generating prime numbers, making the generation of even 10-digit verified primes possible within a reasonable amount of time on a PC.

Further skip sets can be developed to eliminate the selection of potential primes which can be factored by each prime that has already been identified. Although this process is more complex, it can be generalized and made somewhat elegant. At the same time, we can continue to eliminate from the set of test primes each of the primes which the skip sets eliminate multiples of, minimizing the number of tests which must be performed on each potential prime. This example is fully commented, line by line, with some explanation to help the reader fully comprehend how the algorithm works. A complete program including a user interface, but without the comments, can be found in Appendix B.
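To make the idea of a larger skip set concrete, here is a simplified sketch that uses one fixed skip set built from the excepted primes 2, 3 and 5. It is not PG7.8 itself, which according to the text keeps developing further skip sets as it runs. The function names are hypothetical, and the sketch assumes `ep` holds the first few primes in order.

```python
from math import isqrt


def build_skip_set(ep):
    """Gaps between successive integers divisible by no prime in ep, over one cycle."""
    modulus = 1
    for p in ep:
        modulus *= p
    # Survivors from 1 through modulus + 1, so the gaps wrap around one full cycle.
    survivors = [n for n in range(1, modulus + 2) if all(n % p for p in ep)]
    return [b - a for a, b in zip(survivors, survivors[1:])]


def pg_fixed_skip_set(limit, ep=(2, 3, 5)):
    """Return all primes <= limit; ep must be the first few primes, in order."""
    ss = build_skip_set(ep)
    tp = []    # test primes: primes larger than every excepted prime
    pp = 1
    i = 0
    while True:
        pp += ss[i % len(ss)]   # jump straight to the next candidate
        i += 1
        if pp > limit:
            break
        root = isqrt(pp)
        is_prime = True
        for p in tp:            # excepted primes never need to be tested
            if p > root:
                break
            if pp % p == 0:
                is_prime = False
                break
        if is_prime:
            tp.append(pp)
    return [p for p in ep if p <= limit] + tp


print(build_skip_set((2, 3, 5)))  # [6, 4, 2, 4, 2, 4, 6, 2]
print(pg_fixed_skip_set(60))
```

With these three excepted primes, only 8 of every 30 consecutive integers are ever selected as potential primes, and none of them needs to be tested against 2, 3 or 5.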

Please disregard syntactical errors which occur in the user interface such as “the 1th prime”, instead of “1st”, and the inclusion of the last prime generated in the completed array even though it may be larger than the user defined limit. These errors can easily be corrected at the convenience of the student programmer, but were not necessary to illustrate the performance of the algorithms. I apologize for any confusion or inconvenience this may have caused the reader.

Runtime Data

These are the best time results taken from 5 test runs at each limit.

This table records the runtimes for pg7.8 and how many times faster it completes the run up to each limit compared to pgsimple1 and pgsimple4. Note that again the longer the program is run, the more significant the efficiency becomes.

A note from the author
Thank you for taking the time to study the algorithm and I hope that it has inspired you. If you choose to translate this algorithm into another programming language, please email a copy of your work to cfo@mfbs.org.

A note from the prime-number-best-algorithm owner
Well, all the steps are OK, but you can stop at the very beginning: as long as you can state that all 2N with N>1 are not primes, you can also state that no prime except 3 has the form 3N, and so on. The first step leads to the known prime form 6N±1 (that is elegant but wrong, the real form is 6N+1 or 6N+5), but you can do better by using 30N+1, 30N+7, 30N+11 ..... 30N+29, or better still 210N+1, 210N+11, 210N+13 ...... 210N+209. In short, use an algorithm to reduce the "search field" in a smart way, using 1*1, 1*2, 1*2*3, 1*2*3*5, 1*2*3*5*7 .... as the "base" and a list of "displacements" to add.
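The "displacements" for a given base can be listed with a short sketch like the one below; the function name `displacements` is an assumption for illustration. Note that since 6N-1 equals 6(N-1)+5, the forms 6N±1 and 6N+1, 6N+5 describe the same two residue classes.

```python
from math import gcd


def displacements(base):
    """Offsets d in 1..base-1 that share no factor with base."""
    return [d for d in range(1, base) if gcd(d, base) == 1]


print(displacements(6))          # [1, 5]
print(displacements(30))         # [1, 7, 11, 13, 17, 19, 23, 29]
print(len(displacements(210)))   # 48
```

So a base of 6 leaves 2 candidates in every 6 integers, a base of 30 leaves 8 in every 30, and a base of 210 leaves 48 in every 210.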

But note that the speed is still compromised by the presence of multiple, ever-growing "sqrt" and "div" (or better, "mod") operations. The sqrt operations can be "reversed" by simply computing an N^2 that bounds the validity of your tests: the benefit in terms of time is great! The "modulo" operations can also be avoided, but the trick is a little too long to write out here; I will do so sometime in the future.
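One reading of the "reversed" square root is to compare the square of each test prime with the potential prime instead of computing a root. The sketch below illustrates that reading only; the function name is hypothetical, and it assumes `tp` is an ascending list of primes that reaches at least the square root of `pp`.

```python
def is_prime_by_squares(pp, tp):
    """Test pp against the ascending primes in tp, stopping once p*p exceeds pp."""
    for p in tp:
        if p * p > pp:     # same cutoff as p > sqrt(pp), with no sqrt call
            return True
        if pp % p == 0:
            return False
    return True            # only valid under the assumption stated above


small_primes = [2, 3, 5, 7, 11]
print(is_prime_by_squares(97, small_primes))   # True
print(is_prime_by_squares(91, small_primes))   # False (7 * 13)
```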

Therefore, there is no need to calculate primes as long as you can guess them and are able to distinguish between "real" and "fake" ones.

If you had studied the algorithm, you would realize that it does reduce the "search field" in a smart way through the generation of skip sets that filter 6N, 30N, 210N, etc. ad infinitum, with each generation using the known primes as the base, as you suggested. Even so, numbers whose only factors are larger than the base primes of the skip sets must be filtered by testing against primes greater than the filter primes but no greater than the square root of the number being tested. Hence the need to generate at least all the primes up to the square root of the largest number to be tested: that is the only way to distinguish between "real" and "fake" ones.