Numerical methods for matching for teams and Wasserstein barycenters

in collaboration with G. Carlier and A. Oberman



Equilibrium multi-population matching (matching for teams) is a problem from mathematical economics which is related to multi-marginal optimal transport. A special but important case is the Wasserstein barycenter problem, which has applications in image processing and statistics. In our paper Numerical methods for matching for teams and Wasserstein barycenters we introduce two different algorithms: a linear programming algorithm and an efficient nonsmooth optimization algorithm, which applies in the case of the Wasserstein barycenters.

Move your mouse on the pictures to animate (slow connection be patient)

Isobarycenter computation of three and five gaussian measures by a global (first raw) and a localized approach (second raw).

In the discrete setting, the transportation problem is classical. In fact, this problem motivated the historical development of optimization, by Kantorovich in 1939, working on Soviet railway transportation, and in the 1940's by Hitchcock. The assignment problem arises in case of integer values weights, it is a standard combinatorial optimization problem which can be solved by the Hungarian algorithm). More generally, the transportation problem is a linear programming problem which arises when the weights are real-valued, it can be solved by the Hitchcock algorithm, or by modern commerical general linear programming software.
@htmlonly
Isobarycenter of the three textures of the first raw. The pointwise mean of the three textures corresponds to the left picture of the second raw. Wasserstein barycenter is presented on the right picture of the second raw.

Returning to the problem with continuous measures, it is natural to approximate the measure by weighted sums of delta measures. In theory, the resulting problem can be solved using linear programming. However, the number of variables in the linear programming problem is quadratic in the number of variables used to represent the measures. In the discrete setting, current optimization algorithms are limited to approximately several thousands of variables for each of the measures. This problem size corresponds to a fairly coarse approximation of a two dimensional continuous measure. In special cases, or using specific approximations, improvements are available for quadratic costs, and for more references. For example, if each measure is represented by, for example, 1600 variables, the linear program has 2560000 which is near the limitations of linear programming algorithms (we performed experiments using CVX and calling several academic and commercial optimization packages). Enlarging the resolution of the measures quickly overwhelms the capabilities of the algorithms.

The previous problem is even more challenging, since it involves multiple marginals and an additional unknown measure. A reasonable goal is to resolve each of the measures in two dimensions, with (on the order of, say) 1600 variables. Since we wish to allow for measures which are separated, this may require a full grid of size up to 200x200. Resolving the barycenter measure on the full grid generally leads to an intractable problem.

Isobarycenter of the three textures of the first raw. The pointwise mean of the three textures corresponds to the left picture of the second raw. Wasserstein barycenter is presented on the right picture of the second raw.