This repository contains recursion-based implementation of Hilbert Curve Sorting algorithms for 2-tuples and 3-tuples.
Sorting, by intuition, gives an ordering of a set where the distance between neighboring elements is minimized. It is heavily used for DB indexing, visualization and a myriad of other purposes.
The distance may be measured in Hamming Distance for bits, or Euclidean 2-norm for vectors.
Sometimes our data is multi-dimensional,
like sRGB colors {
The simplest sort one might use is the Lex Order, which compares the first non-identical component of tuples, but this approach has many flaws.
Just take
$[00, 01, 10, 11]$ as an example, Lex Order will return the original sequence, which fails to minimize element-wise Hamming Distance.
A better ordering is$[00, 01, 11, 10]$ , which is the Gray Code sequence$G_2$ .
Mapping from
Technically the Hilbert Curves maps the unit interval
$[0, 1]$ to the unit n-hypercube$U_n$ , but the hypercube could be offsetted and scaled to fit our data.
Neighboring tuples in the input space will also end up close to each other on the Hilbert Curve,
preserving the spatial relationship between the input points.
Some key properties of the Hilbert Curves that allows us to implement a recursive algorithm is, that they are locality-preserving space-filling curves, and that they are fractals. Details below.
-
Locality-preserving - Points on the Hilbert Curve converges to a definite point as iteration count increases.
This is in general not true for other curves, where the resulting points might shift vastly over iterations.
-
Fractalness - Every iteration
$\mathbb{H}(n)$ implicitly traces (scaled)$\mathbb{H}(n-k), k \le n$ curves.
H_2 Overlay, First 3 Iterations ~ Fractalness
Images by Geoff Richards (Qef) - Own work, Public Domain
Now, take the
Divide
The orientation of
$n$ -d Hilbert Curve is based off the Gray Code sequence$G_n$ , interpreting the bits as mentioned above, generalized.
This is intuitive as the Gray Code is a permutation of the Bit Code, which contains every vertex of$U_n$ , and only one bit (axis) is flipped (shifted) between successive Gray Code bits.
This allows us to divide an arbitrary number of points into quadrants and sort the quadrants instead, after which we concatenate the results in
The recursion base will be when there are only 1 or none vector left, or if all vectors are identical. The input vectors will be returned.
The recursion step will offset and scale the vectors to fit a hypercube, run recursion, and return de-fitted result vectors.
The
An interactive demo GitHub Page. It contains demos of:
-
2-D points sorting - connects set / random 2-D points according to the sorted sequence.
The result from sorting random points will roughly estimate a hilbert curve.
-
Colors (sRGB) sorting - sorting sRGB colors, using
$(R, G, B)$ as the color vector, with a pre/post-sort comparison.The result would be more gradient-like (overall) than the raw version.
HSL might give better results (?), you could experiment with it freely, through minor changes to the assets/demo.html script source code.
The time complexity of the algorithms is estimated to be
Space complexity of the algorithms is expected to be high, as recursion is heavily used.
Overall performance should be pretty good, as a toy. But since
Here the complexities are estimated assuming uniform distribution of points in the hypercube (spacing 1), so performance shall not be a problem in most cases.
Due to the heavy use of recursion, asynchronous operations are used to spread computational load.
Generalization to higher dimensions is possible but much more complicated.
Here I only implemented the
The algorithms could be used for fast approximations of the Travelling Salesman Problem. A test was ran on pla85900.tsp that contains 85900 nodes. The final distance 188465250 (CEIL_2D) is only a rough x1.32 of the Mathematically optimal distance 142382641 (CEIL_2D). That might seem quite bad, but many dedicated approximation algorithms do no better, yet this algorithm is more efficient. (details)