Freaky Prime Numbers Discovery

So I bumped into this article from Gizmodo today. It talks about how there is a bias in the last digit in consecutive prime numbers, or more generally the non-uniformity of the distribution of consecutive primes modulo any number. The referenced journal article mentions distributions for primes ending in 1 following primes ending in 1, primes ending in 3 following primes ending in 1, and so on for the pairs (7, 1), (9, 1); (1, 3), (3, 3), (7, 3), (9, 3); (1, 7), (3, 7), (7, 7), (9, 7); (1, 9), (3, 9), (7, 9), (9, 9). Unfortunately, it doesn’t mention how the ratio of these groups over the total amount of pairs varies with the progression of the sequence of primes; unfortunately, there is no such graph either in the Gizmodo article, or in the journal paper.

This disappointment led me to make a crude plot of my own, after verifying that I get the same distribution for the first 100M primes as mentioned in the journal paper (in fact, some of my counts are off by 1, in particular being:

1st prime 2nd prime amount
1 1 4623041 (-1)
1 3 7429438 (same)
1 7 7504612 (same)
1 9 5442344 (-1)
3 1 6010981 (-1)
3 3 4442561 (-1)
3 7 7043695 (same)
3 9 7502896 (same)
7 1 6373982 (+1)
7 3 6755195 (same)
7 7 4439355 (same)
7 9 7431870 (same)
9 1 7991431 (same)
9 3 6372940 (-1)
9 7 6012739 (same)
9 9 4622916 (same)

I don’t know why this is the case, but it is virtually the same result. I’m open to suggestions.) Here is a plot of the evolution of the distributions for the first 10M primes, in 100K increments (100 steps) and for the first 100M primes, in 1M increments (100 steps):


There are a few things that are immediately evident from these crude plots, which are not so much evident from the table above:

  1. The distributions of (1, 1) and (9, 9) are very close, and so are these of (3, 3) and (7, 7);
  2. The same goes for (1, 3) and (7, 9), as well as (3, 1) and (9, 7), which seems even more interesting;
  3. The same goes for (1, 7) and (3, 9), as well as (7, 1) and (9, 3), a real pattern, perhaps?
  4. The next pairs would be (1, 9) and (1, 9), but that’s the same one, so nothing non-trivial here; the last pairs are (3, 7) and (7, 3) which don’t share their distribution with any other pairs;

Most importantly, it is clear from the graphs that the distributions approach the average of 1/16, so it leaves one wondering whether for large enough maximum prime the ratios would indeed become equal.

Admittedly, my analysis here is more reminiscent of numerology than of mathematics.And who knows, maybe there are simple mathematical reasons for my observations, but I am lazy to do the math. Still, this post illustrates one important truth: it takes far less time to get an intuition about a result when it’s presented graphically, rather than in pages upon pages of equations, words and numbers. Simply put: images make research more accessible.

SVN revisions and Git commits corresponding to LLVM Releases

After looking for the following information and not being able to find it, I gathered it myself and organised it in a table.

The SVN repository is located at and the Git repository is


Release Date SVN revision Git commit
3.6.1 26 May 2015 237753 95d08bce8
3.6.0 27 Feb 2015 230432 2013f8f1b2
3.5.2 02 Apr 2015 233140 c50bf46a22
3.5.1 20 Jan 2015 225662 d275e025d2
3.5.0 03 Sep 2014 216955 a2e22cd4cd
3.4.2 19 Jun 2014 213524 6816d66d99
3.4.1 07 May 2014 208030 f3a199b2ae
3.4.0 02 Jan 2014 197954 e97b13228a
3.3 17 Jun 2013 183500 f4a66d2005
3.2 20 Dec 2012 170557 91223a41ef
3.1 22 May 2012 156746 dc23c1463a
3.0 01 Dec 2011 145347 ba78c883d4
2.9 06 Apr 2011 129055 a844a3e7b2
2.8 05 Oct 2010 115865 fa7fb64fad
2.7 27 Apr 2010 102412 fc0b860bcc
2.6 23 Oct 2009 84988 66284e063a
2.5 02 Mar 2009 65925 1c08c0e128

After-tutorial clarifications

At today’s functional programming tutorial I couldn’t answer (at least) three questions, so here is what I found out after checking for solutions:

1) Ewen’s solution to unequals:

unequals :: [(Int, Int)] -> [(Int, Int)]
unequals [x, y] = filter (/=y) [x, y]

This seemed very confusing to me, as it looks like the intention is to match all of the first elements of the pairs to x, and all of the second elements to y. But then (/=y) would compare a number to a list and it all got a little bit too confusing. However, when I tried it out, it worked! What is going on here?

*Main> unequals [(1,2), (3,3)]

I pondered for a bit and then laughed at myself, while trying to send three arguments to the function:

*Main> unequals [(1,2), (3,3), (4,5)]
*** Exception: temp.hs:27:1-37: Non-exhaustive patterns in function unequals

Then it suddenly became obvious what our implementation means:
[x, y] matches lists of two elements, calling the first one x, and the second one y. Then filter (/=y) [x, y] will always remove the second element from this list, which by chance happened to be the one that we didn’t want anyway (3,3). So yes, the function works in this particular case, but not in general🙂 As we discussed, a correct solution would be

unequals xs = filter helper xs
    where helper (a, b) = a /= b

2) Felix asked me about debugging in ghci. His question is motivated by having something like this:

inverse m = scaleMatrix (1 / determinant m) (transpose $ cofactors m)

which you are not sure if it works and suspect that it doesn’t. It is cumbersome to split it to smaller expressions and test each of them out. Luckily, it turns out there is debugging in ghci, and this page explains how to use it:

3) $ and .
I shamefully admitted that I don’t know what $ means exactly but that it can be used to remove parenthesis, while . is function composition. A good explanation about their difference can be found here:

Ask questions.

From zero to cross-compiling for pandaboard

ARM NEON general-purpose SIMD engine

I spent the last two weeks trying to set up a development environment for myself in which I could tinker with a compiler targeting an ARM processor with NEON capabilities. The process turned out to be more complicated than I imagined, so I am documenting my findings with the hope of helping out someone else trying to do the same thing (most probably my future self).

The existing compiler

LLVM Compiler Suite

Choosing an open source compiler to use as a base was easy. I have worked on LLVM before and its code is easier to maintain and modify than that of the other big candidate: GCC.

An important thing to note is that if one chooses to work on LLVM, they should get the latest official release or the current development code. The compiler as a program is complicated enough, and the additional challenge of building an outdated monster of a program is unneeded. However, I did this mistake trying to build v2.9 while the current development version was 3.4. After a couple of days I gave up.

Also one should really use the help resources that are available and ask questions early enough. The mailing list for LLVM developers is a very useful one.

A plain vanilla build of LLVM should be enough for a taste of cross-compilation. In order to obtain one you would need this guide and patience. A typical build on my machine takes about an hour. And it is not a tiny machine…

For a quick cheat sheet here are the commands to get the latest build:

> svn co llvm
> cd llvm/tools
> svn co clang
> cd ../..
> mkdir build
> cd build
> ../llvm/configure --enable-optimized=YES --enable-assertions=NO
> make -j<num-of-cores> e.g make -j4

Getting the ARM resources

Once you have a working LLVM compiler, you would need the standard libraries compiled for ARM so that you can link against them. Since the pandaboard has a Cortex-A9 processor on it, we need the ‘armihf’ versions of the libraries (I guess ‘hf’ comes from hard float). From this guide it turns out we need (at least) the following libraries.

  • gcc-4.7-arm-linux-gnueabihf
  • gcc-4.7-multilib-arm-linux-gnueabihf
  • binutils-arm-linux-gnueabihf
  • libgcc1-armhf-cross
  • libsfgcc1-armhf-cross
  • libstdc++6-armhf-cross
  • libstdc++6-4.7-dev-armhf-cross

You should be able to install them via the standard package manager that you’re using.

Compiling an example

Now that you have a compiler and the necessary libraries, let’s try to compile something to see if it works.

Here is a complete example program, build on the code from this post It uses a few of the NEON intrinsics so it is a good test for our toolchain.

Once you have copied the code in a file (say rgb.c) you can compile it with the following command:

> clang -target armv7a-linux-gnueabihf -mcpu=cortex-a9 -I/usr/arm-linux-gnueabihf/include/c++/4.7.3/arm-linux-gnueabihf/ -I/usr/arm-linux-gnueabihf/include -mfloat-abi=hard -ccc-gcc-name arm-linux-gnueabihf-gcc -O2 rgb.c -o rgb

The options should be self-explanatory, but they weren’t for me, so just in case:

  • -target armv7a-linux-gnueabihf: set the correct target architecture.
  • -mcpu=cortex-a9: set the correct target processor. See llc -mtriple armv7a-linux-gnueabihf -mcpu=help for all the possibilities.
  • -I/usr/arm-linux-gnueabihf/include/c++/4.7.3/arm-linux-gnueabihf/: point the compiler to where the standard C++ includes are
  • -I/usr/arm-linux-gnueabihf/include: point the compiler to where the standard C includes are
  • -mfloat-abi=hard: there is hardware floating point unit on the cortex-a9
  • -ccc-gcc-name arm-linux-gnueabihf-gcc: specify the variant of gcc to use
  • -O2: level 2 optimizations
  • -o rgb: write the output to the file named “rgb”

Pandaboard’s Panda

After the compilation is done copy the executable to your pandaboard and run it. If it produces the following output, then you’re done!