## Available Options

### Optimization method

Deformetrica offers two optimization methods for the minimization of the criterion in sparseMatching and sparseAtlas:

• gradient-descent: a gradient descent with adaptive stepsize. The initial stepsize is determined heuristically and then adjusted during the iterations. In sparseMatching there is a single stepsize. In sparseAtlas there are two stepsizes, one for the deformation parameters (control points and momenta) and one for the template parameters; the two stepsizes are adjusted independently.
• Nesterov scheme: similar to a gradient descent, except that two variables are maintained during the iterations, and the next iterate is a clever linear combination of the two variables from the previous step. Its convergence rate is $O(1/k^2)$ in the number of iterations $k$, instead of $O(1/k)$ for the gradient descent.

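The two schemes can be sketched as follows on a toy quadratic. This is an illustrative sketch, not Deformetrica's code: the Nesterov scheme keeps an auxiliary point `y` at which the gradient is evaluated, and combines the last two iterates at each step.

```python
# Sketch: plain gradient descent vs the Nesterov scheme on the toy
# quadratic f(x) = 0.5 * a * x**2 - b * x (minimum at x = b / a).

def grad(x, a=10.0, b=3.0):
    return a * x - b

def gradient_descent(x0, step, iters):
    x = x0
    for _ in range(iters):
        x -= step * grad(x)
    return x

def nesterov(x0, step, iters):
    x, y, t = x0, x0, 1.0
    for _ in range(iters):
        x_new = y - step * grad(y)
        t_new = 0.5 * (1.0 + (1.0 + 4.0 * t * t) ** 0.5)
        # the next point is a linear combination of the two previous iterates
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x

err_gd = abs(gradient_descent(0.0, 0.05, 50) - 0.3)
err_nes = abs(nesterov(0.0, 0.05, 50) - 0.3)
```

Both runs converge to the minimum $b/a = 0.3$; on poorly conditioned problems the accelerated scheme needs far fewer iterations for the same accuracy.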
If the sparsity prior is set to a non-zero value, generalized versions of these optimization schemes are used:

• gradient descent becomes ISTA (Iterative Soft Thresholding Algorithm)
• Nesterov scheme becomes FISTA (Fast Iterative Soft Thresholding Algorithm)

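The difference between a plain gradient step and an ISTA step is the proximal (soft-thresholding) operator applied after the gradient move. A minimal sketch (not Deformetrica's code), assuming the group prior $\gamma \sum_k \|\alpha_{0,k}\|$: each momentum vector is shrunk in norm, and set exactly to zero when its norm falls below `step * gamma`, which is what produces sparsity.

```python
def soft_threshold(alpha, threshold):
    """Block soft-thresholding of one 2D/3D momentum vector."""
    norm = sum(a * a for a in alpha) ** 0.5
    if norm <= threshold:
        return [0.0] * len(alpha)  # the whole vector is zeroed out
    scale = 1.0 - threshold / norm
    return [scale * a for a in alpha]

def ista_step(alphas, grads, step, gamma):
    """One ISTA iteration: gradient step, then proximal step, per vector."""
    out = []
    for alpha, g in zip(alphas, grads):
        moved = [a - step * gi for a, gi in zip(alpha, g)]
        out.append(soft_threshold(moved, step * gamma))
    return out
```

FISTA applies the same proximal step to the Nesterov iterate instead of the plain gradient iterate.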
The optimization method is set in the paramDiffeos.xml files: the field use-fista is turned on for the Nesterov/FISTA scheme (the default) and off for the gradient-descent/ISTA scheme.

References:
[Nesterov 1983] Y. Nesterov, A method of solving a convex programming problem with convergence rate $O(1/k^2)$, Soviet Math. Dokl., 27(2), 1983 (translation by A. Rosa)

[Beck and Teboulle 2009] A. Beck and M. Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM Journal on Imaging Sciences, 2(1) 183–202, 2009

### Kernel evaluation

Most of the computation time is spent evaluating sums of the form $\sum_{i=1}^N \sum_{j=1}^M K(x_i,y_j)\alpha_i^T\beta_j$, their gradients or their Hessian matrices. There are four different implementations of such computations:

• exact: the evaluation is computed with a double for-loop over the points $x_i$ and $y_j$
• cudaexact: the same double loop with a CUDA implementation, to increase performance on GPUs
• p3m: the evaluation is approximated using Fast Fourier Transforms. If the points $x_i$ and $y_j$ lie on a regular lattice, the multiplication by the kernel K amounts to a convolution, which can be computed with FFTs. The scheme therefore projects the points onto a regular lattice and interpolates the result back. The approximation error is proportional to the squared ratio between the lattice step and the kernel width.
• fgt: the evaluation is approximated using a Fast Gauss Transform. This approximation clusters the points $x_i$ and approximates the contribution of a cluster that is far from the evaluation point by the center of mass of the cluster.

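For reference, the exact scheme is nothing more than the double loop below. This is an illustrative sketch (not Deformetrica's code), assuming a Gaussian kernel of width `sigma`:

```python
import math

def gaussian_kernel(x, y, sigma):
    # K(x, y) = exp(-|x - y|^2 / sigma^2)
    d2 = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-d2 / (sigma * sigma))

def exact_sum(xs, ys, alphas, betas, sigma):
    """Evaluate sum_i sum_j K(x_i, y_j) * alpha_i . beta_j exactly."""
    total = 0.0
    for x, alpha in zip(xs, alphas):
        for y, beta in zip(ys, betas):
            k = gaussian_kernel(x, y, sigma)
            total += k * sum(a * b for a, b in zip(alpha, beta))
    return total
```

The cost is $O(NM)$ kernel evaluations, which is what motivates the approximate p3m and fgt schemes for large point sets.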
The evaluation method is set in paramDiffeos.xml via the field kernel-type, which can be exact, cudaexact, p3m or fgt.
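
The far-field idea behind the fgt scheme can be sketched crudely as follows. This is only an illustration of the clustering principle, not the actual Improved Fast Gauss Transform (which uses Taylor expansions around the cluster centers):

```python
import math

def gaussian(x, y, sigma):
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (sigma * sigma))

def cluster_approximation(x, sources, weights, sigma):
    """Approximate sum_j K(x, y_j) * w_j for a cluster of sources far
    from x by a single kernel evaluation at the cluster's center of
    mass, carrying the aggregated weight."""
    n = len(sources)
    dim = len(sources[0])
    center = [sum(y[d] for y in sources) / n for d in range(dim)]
    return gaussian(x, center, sigma) * sum(weights)
```

The tighter the cluster relative to the kernel width and the farther it is from the evaluation point, the smaller the approximation error.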

References:

about p3m method: S. Durrleman, Statistical models of currents for measuring the variability of anatomical curves, surfaces and their evolution, PhD thesis, University of Nice Sophia-Antipolis, 2010 [Chapter 2]

about fgt method: C. Yang, R. Duraiswami, and L. Davis, Efficient Kernel Machines Using the Improved Fast Gauss Transform, In Advances in Neural Information Processing Systems, 17, 1561-1568, 2005.

### Sparsity Priors

Deformetrica offers the possibility to add an $L^1$ prior to the cost function minimized in sparseMatching and sparseAtlas. In sparseMatching the prior reads $+ \gamma \sum_{k} \|\alpha_{0,k}\|$, where the $\alpha_{0,k}$ are the momentum vectors attached to the control points and $\|\cdot\|$ denotes the $L^2$ norm of 2D/3D vectors. In sparseAtlas, the prior reads $+ \gamma \sum_{i=1}^{N_{su}} \sum_{k} \|\alpha^i_{0,k}\|$, with one set of momenta per subject.

The value of $\gamma$ is set in the paramDiffeos.xml files. If it is non-zero, the optimization methods ISTA and FISTA are used in place of the gradient descent and the Nesterov scheme. At the end of the algorithm, only the control points that carry at least one non-zero momentum vector are written to the CP_final.txt file.
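
The final filtering step can be sketched as below (a hypothetical helper, not Deformetrica's code), here for the sparseMatching case with one momentum vector per control point:

```python
def active_control_points(control_points, momenta, tol=1e-12):
    """Keep only the control points whose momentum vector is non-zero
    (up to a numerical tolerance)."""
    kept = []
    for cp, alpha in zip(control_points, momenta):
        if any(abs(a) > tol for a in alpha):
            kept.append(cp)
    return kept
```

In the atlas case the same test is applied across all subjects: a control point is kept as soon as any of its $N_{su}$ momentum vectors is non-zero.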

References:

S. Durrleman, S. Allassonnière, S. Joshi, Sparse Adaptive Parameterization of Variability in Image Ensembles, International Journal of Computer Vision, 101(1), 161-183, 2013