Many computer vision problems are elegantly expressed as energy minimization. We characterize a class of energies with hierarchical costs and propose a novel hierarchical fusion algorithm for minimizing them. Hierarchical costs are natural for modeling an array of difficult problems. For example, in semantic segmentation one could rule out unlikely object combinations via hierarchical context. In geometric model estimation, one could penalize the number of unique model families in a solution, not just the number of models, which gives a kind of hierarchical minimum description length (MDL) criterion. Hierarchical fusion uses the well-known α-expansion algorithm as a subroutine, and offers a much better approximation bound in important cases.
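To make the hierarchical MDL idea concrete, the following is a minimal sketch of a label-cost term that charges once for each model used and once for each model family used. It covers only this one term and leaves out any data or smoothness costs; it is not the hierarchical fusion algorithm itself. The function name, the two-level hierarchy, and the cost values are all assumptions chosen for illustration.

```python
from typing import Dict, Hashable, Sequence


def hierarchical_label_cost(
    labeling: Sequence[Hashable],
    family_of: Dict[Hashable, Hashable],
    model_cost: Dict[Hashable, float],
    family_cost: Dict[Hashable, float],
) -> float:
    """Illustrative two-level label cost (hypothetical helper).

    Each distinct model that appears in the labeling is charged once,
    and each distinct family containing a used model is charged once.
    """
    used_models = set(labeling)
    used_families = {family_of[m] for m in used_models}
    return (
        sum(model_cost[m] for m in used_models)
        + sum(family_cost[f] for f in used_families)
    )


# Assumed toy hierarchy: two line models and one circle model.
family_of = {"line1": "line", "line2": "line", "circle1": "circle"}
model_cost = {"line1": 1.0, "line2": 1.0, "circle1": 1.0}
family_cost = {"line": 5.0, "circle": 5.0}

# Both labelings use two models, but the second spans two families.
same_family = ["line1", "line2", "line1"]
mixed_family = ["line1", "circle1", "line1"]

cost_same = hierarchical_label_cost(same_family, family_of, model_cost, family_cost)
cost_mixed = hierarchical_label_cost(mixed_family, family_of, model_cost, family_cost)
```

Under these assumed costs, a plain per-model penalty cannot distinguish the two labelings, since both use exactly two models. The family-level term makes the mixed labeling more expensive, which is the preference the hierarchical criterion is meant to encode.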