We propose a novel patch-based image representation that is useful because it (1) inherently detects regions with repetitive structure at multiple scales and (2) yields a parameterless hierarchical segmentation. We describe an image by breaking it into coherent regions where each region is well-described (easily reconstructed) by repeatedly instantiating a patch using a set of simple transformations. In other words, a good segment is one that has sufficient repetition of some pattern, and a patch is useful if it contains a pattern that is repeated in the image.
Our criterion is naturally expressed by the well-established minimum description length (MDL) principle. MDL prefers spatially coherent regions with consistent appearance and avoids parameter tuning. We minimize the description length (in bits) of the image by encoding it with patches. Because a patch is itself an image, we measure its description length by applying the same idea recursively: encode a patch by breaking it into regions described by yet simpler patches. The resulting hierarchy of inter-dependent patches naturally leads to a hierarchical segmentation.
We minimize description length over our class of image representations (all patch hierarchies / partitions). We formulate this problem as a recursive multi-label energy. Existing optimization techniques are either inapplicable or get stuck in poor local minima. We propose a new hierarchical fusion (HF) algorithm for energies containing a hierarchy of label costs. Our algorithm is a contribution in itself and should be useful for this new and difficult class of energies.
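To make the description-length criterion concrete, the following toy sketch compares the cost in bits of storing a 1-D region symbol by symbol against the cost of storing one short patch plus corrections for the positions where tiling that patch fails. It is an illustration under simplifying assumptions, not the encoding used in this work: the function names (`raw_bits`, `tiled_bits`, `best_patch`), the cost model, and the restriction to plain translation in 1-D are all hypothetical, and the recursive encoding of the patch itself is omitted.

```python
import math


def raw_bits(region, bits_per_symbol=1):
    """Cost of storing every symbol of the region directly."""
    return len(region) * bits_per_symbol


def tiled_bits(region, patch, bits_per_symbol=1):
    """Cost of storing the patch once, plus one correction per mismatch.

    Assumed toy cost model: each correction pays for a position index
    and for the replacement symbol.
    """
    n, k = len(region), len(patch)
    patch_cost = k * bits_per_symbol
    # Count positions where repeating the patch does not reproduce the region.
    mismatches = sum(1 for i, s in enumerate(region) if s != patch[i % k])
    position_bits = math.ceil(math.log2(n)) if n > 1 else 1
    correction_cost = mismatches * (position_bits + bits_per_symbol)
    return patch_cost + correction_cost


def best_patch(region, max_len):
    """Pick the prefix of the region that gives the shortest description."""
    candidates = [region[:k] for k in range(1, max_len + 1)]
    return min(candidates, key=lambda p: tiled_bits(region, p))


if __name__ == "__main__":
    region = [0, 1, 1] * 8           # a perfectly repetitive region
    patch = best_patch(region, 6)
    print(patch, tiled_bits(region, patch), raw_bits(region))
```

Under this cost model a region with a strongly repeated pattern is much cheaper to describe by its patch than symbol by symbol, and an occasional deviation from the pattern raises the cost only by the price of one correction. This mirrors the intuition that a good segment is one with sufficient repetition of some pattern.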