User:Gkhan/Closest pair of points

Let's say that we have a number of points in a 2-dimensional plane with x-coordinates stored in array x and y-coordinates in the array y. How would one go about solving this problem? The obvious way is to compare all the points to all of the others. How would such an algorithm look? Something like this (it returns a 3-tuple with the two points and the distance between them):

function ClosestPair(x[], y[], numPoints) min := $$\infty$$ point1 := 0 point2 := 0 for i := 1 to numPoints-1 for j := i + 1 to numPoints if sqrt((x[i] - x[j])^2 + (y[i] - y[j])^2) < min: min := sqrt((x[i] - x[j])^2 + (y[i] - y[j])^2) point1 := i point2 := j fi repeat repeat return {p1, p2, min} end function

What is the running time of this algorithm. Since the number of iterations in the inner loop starts with p-1 and decresases by one each time until it reaches 1, in which case the main loop finishes. It is thus an arithmetic progression from one to p-1:
 * $$1+2+3+4+\ldots+(p-2)+(p-1) = {p(p-1) \over 2}$$

It is therfore an $$O(n^2)$$ algorithm were n is the number of points.

Can we do better? Yes we can! First what we do is divide the group of points into two groups of size $$\mathcal{b}{n / 2}\mathcal{c}$$ and $$\mathcal{d}{n / 2}\mathcal{e}$$ by drawing a line through the point number $$\mathcal{b}{n / 2}\mathcal{c}$$ counted from the left. ($$\mathcal{b}x\mathcal{c}$$ here represents the floor function and $$\mathcal{d}x\mathcal{e}$$ is the ceiling function). We then perform the perform the algorithm recursivly for the two groups. If the group size is 1 we return $$\infty$$ or if the size is 2 we return the distance between the two points.

We can now start making creating some code. Note that the points in x[] and y[] are sorted after their x-coordinate primarily and y-coordinate secondarily (i.e if their x-coordinate is the same).

function ClosestPair(x[], y[], startIndex, finishIndex) numPoints := finishIndex - startIndex + 1 min = $$\infty$$ if numPoints == 1 return $$\infty$$ else if numPoints == 2 return sqrt((x[startIndex] - x[finishIndex])^2 + (y[startIndex] - y[finishIndex])^2) else min = Min(ClosestPair(x[], y[], startIndex, startIndex + Floor(numPoints/2)),               ClosestPair(x[], y[], startIndex + Ceiling(numPoints/2)), finishIndex))   fi   ......... end function

This function obviously doesn't give the right results since none of the points in group 1 is compared to the points in group 2. What if the two closest points lie just left and right of the dividing line? We obviosly have to compare the groups with eachother.

Let's say that the distance between the closest pair of points in the groups is k (the variable min in the code) then if there exists a closer pair in different groups they must lie in a 2k wide strip over the dividing line.