
.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _chapter-nnls_covariance:

=====================
Covariance Estimation
=====================

Introduction
============

One way to assess the quality of the solution returned by a non-linear
least squares solver is to analyze the covariance of the solution.

Let us consider the non-linear regression problem

.. math::  y = f(x) + N(0, I)

i.e., the observation :math:`y` is a random non-linear function of the
independent variable :math:`x` with mean :math:`f(x)` and identity
covariance. Then the maximum likelihood estimate of :math:`x` given
observations :math:`y` is the solution to the non-linear least squares
problem:

.. math:: x^* = \arg \min_x \|f(x) - y\|^2

And the covariance of :math:`x^*` is given by

.. math:: C(x^*) = \left(J'(x^*)J(x^*)\right)^{-1}

Here :math:`J(x^*)` is the Jacobian of :math:`f` at :math:`x^*`. The
above formula assumes that :math:`J(x^*)` has full column rank.
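
For a model that is linear in its parameters the Jacobian is constant
and this formula can be evaluated in closed form. A minimal standalone
sketch (plain C++, independent of Ceres) for a straight line fit
:math:`f(x; t) = x_0 + x_1 t`:

.. code-block:: c++

 #include <cassert>
 #include <cmath>

 // Covariance (J'J)^{-1} for the straight line model
 // f(x; t) = x_0 + x_1 * t observed at n points t[0..n-1]. The Jacobian
 // row for observation i is [1, t_i], which does not depend on x.
 // Writes {C00, C01, C11} into C.
 void LineFitCovariance(const double* t, int n, double C[3]) {
   // Form the normal matrix J'J.
   double h00 = 0.0, h01 = 0.0, h11 = 0.0;
   for (int i = 0; i < n; ++i) {
     h00 += 1.0;
     h01 += t[i];
     h11 += t[i] * t[i];
   }
   // Invert the 2x2 normal matrix.
   double det = h00 * h11 - h01 * h01;
   C[0] = h11 / det;   // variance of x_0
   C[1] = -h01 / det;  // covariance of x_0 and x_1
   C[2] = h00 / det;   // variance of x_1
 }

 int main() {
   const double t[3] = {0.0, 1.0, 2.0};
   double C[3];
   LineFitCovariance(t, 3, C);
   // J'J = [[3, 3], [3, 5]], det = 6, so C = [[5/6, -1/2], [-1/2, 1/2]].
   assert(std::fabs(C[0] - 5.0 / 6.0) < 1e-12);
   assert(std::fabs(C[1] + 0.5) < 1e-12);
   assert(std::fabs(C[2] - 0.5) < 1e-12);
   return 0;
 }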

If :math:`J(x^*)` is rank deficient, then the covariance matrix :math:`C(x^*)`
is also rank deficient and is given by the Moore-Penrose pseudo inverse.

.. math:: C(x^*) =  \left(J'(x^*)J(x^*)\right)^{\dagger}

Note that in the above, we assumed that the covariance matrix for
:math:`y` was identity. This is an important assumption. If this is
not the case and we have

.. math:: y = f(x) + N(0, S)

where :math:`S` is a positive semi-definite matrix denoting the
covariance of :math:`y`, then the maximum likelihood problem to be
solved is

.. math:: x^* = \arg \min_x \left(f(x) - y\right)' S^{-1} \left(f(x) - y\right)

and the corresponding covariance estimate of :math:`x^*` is given by

.. math:: C(x^*) = \left(J'(x^*) S^{-1} J(x^*)\right)^{-1}

So, if the observations being fitted have a covariance matrix that is
not identity, then it is the user's responsibility to ensure that the
corresponding cost functions are correctly scaled, e.g., in the above
case the cost function for this problem should evaluate
:math:`S^{-1/2} f(x)` instead of just :math:`f(x)`, where
:math:`S^{-1/2}` is the inverse square root of the covariance matrix
:math:`S`.
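
As a concrete illustration of this scaling, here is a hedged sketch of
a residual functor that applies :math:`S^{-1/2}` for a *diagonal*
covariance :math:`S` (the model :math:`f` and all names here are
hypothetical; in a real problem the functor would be wrapped in a cost
function, e.g. via automatic differentiation):

.. code-block:: c++

 #include <cassert>
 #include <cmath>

 // Whitened residual: returns S^{-1/2} (f(x) - y) instead of the raw
 // residual f(x) - y, for a diagonal covariance S = diag(s_0, s_1).
 // The stand-in model is f(x) = [x_0, x_0 + x_1].
 struct WhitenedResidual {
   WhitenedResidual(const double* y, const double* s) : y_(y), s_(s) {}

   template <typename T>
   bool operator()(const T* x, T* residual) const {
     T f0 = x[0];
     T f1 = x[0] + x[1];
     // Scale each residual by 1 / sqrt(s_i), i.e. apply S^{-1/2}.
     residual[0] = (f0 - y_[0]) / std::sqrt(s_[0]);
     residual[1] = (f1 - y_[1]) / std::sqrt(s_[1]);
     return true;
   }

   const double* y_;
   const double* s_;
 };

 int main() {
   const double y[2] = {1.0, 3.0};
   const double s[2] = {4.0, 9.0};  // diagonal of S
   const double x[2] = {2.0, 2.0};  // f(x) = [2, 4], raw residuals [1, 1]

   double residual[2];
   WhitenedResidual(y, s)(x, residual);
   // Whitened residuals: 1 / sqrt(4) = 0.5 and 1 / sqrt(9) = 1/3.
   assert(std::fabs(residual[0] - 0.5) < 1e-12);
   assert(std::fabs(residual[1] - 1.0 / 3.0) < 1e-12);
   return 0;
 }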

Gauge Invariance
================

In structure from motion (3D reconstruction) problems, the
reconstruction is ambiguous up to a similarity transform. This is
known as a *Gauge Ambiguity*. Handling Gauges correctly requires the
use of SVD or custom inversion algorithms. For small problems the
user can use the dense algorithm. For more details see the work of
Kanatani & Morris [KanataniMorris]_.


:class:`Covariance`
===================

:class:`Covariance` allows the user to evaluate the covariance for a
non-linear least squares problem and provides random access to its
blocks. The computation assumes that the cost functions compute
residuals such that their covariance is identity.

Since the computation of the covariance matrix requires computing the
inverse of a potentially large matrix, this can involve a rather large
amount of time and memory. However, it is usually the case that the
user is only interested in a small part of the covariance matrix --
quite often just the block diagonal. :class:`Covariance` allows the
user to specify the parts of the covariance matrix that she is
interested in and then uses this information to only compute and
store those parts of the covariance matrix.

Rank of the Jacobian
====================

As we noted above, if the Jacobian is rank deficient, then the inverse
of :math:`J'J` is not defined and instead a pseudo inverse needs to be
computed.

The rank deficiency in :math:`J` can be *structural* -- columns which
are always known to be zero -- or *numerical*, where the rank depends
on the exact values in the Jacobian.

Structural rank deficiency occurs when the problem contains parameter
blocks that are constant. :class:`Covariance` correctly handles
structural rank deficiency of this kind.

Numerical rank deficiency, where the rank of the matrix cannot be
predicted by its sparsity structure and requires looking at its
numerical values, is more complicated. Here again there are two
cases.

  a. The rank deficiency arises from overparameterization, e.g., a
     four dimensional quaternion used to parameterize :math:`SO(3)`,
     which is a three dimensional manifold. In cases like this, the
     user should use an appropriate
     :class:`LocalParameterization`. Not only will this lead to better
     numerical behaviour of the Solver, it will also expose the rank
     deficiency to the :class:`Covariance` object so that it can
     handle it correctly.

  b. More general numerical rank deficiency in the Jacobian requires
     the computation of the so-called singular value decomposition
     (SVD) of :math:`J'J`. We do not know how to do this for large
     sparse matrices efficiently. For small and moderate sized
     problems this is done using dense linear algebra.
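
For case (b), the covariance estimator must be told to compute a
pseudo inverse. A sketch of how one might configure it, using the
options described below:

.. code-block:: c++

 // Use dense SVD so that a rank deficient Jacobian results in a
 // pseudo inverse instead of a failure, and drop all eigenvalues
 // below the reciprocal condition number threshold.
 Covariance::Options options;
 options.algorithm_type = ceres::DENSE_SVD;
 options.null_space_rank = -1;
 Covariance covariance(options);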


:class:`Covariance::Options`
============================

.. class:: Covariance::Options

.. member:: int Covariance::Options::num_threads

   Default: ``1``

   Number of threads to be used for evaluating the Jacobian and
   estimation of covariance.

.. member:: SparseLinearAlgebraLibraryType Covariance::Options::sparse_linear_algebra_library_type

   Default: ``SUITE_SPARSE``, if Ceres Solver is built with support for
   `SuiteSparse <http://faculty.cse.tamu.edu/davis/suitesparse.html>`_,
   and ``EIGEN_SPARSE`` otherwise. Note that ``EIGEN_SPARSE`` is
   always available.

.. member:: CovarianceAlgorithmType Covariance::Options::algorithm_type

   Default: ``SPARSE_QR``

   Ceres supports two different algorithms for covariance estimation,
   which represent different tradeoffs in speed, accuracy and
   reliability.

   1. ``SPARSE_QR`` uses the sparse QR factorization algorithm to
      compute the decomposition

      .. math::

         QR &= J\\
         \left(J^\top J\right)^{-1} &= \left(R^\top R\right)^{-1}

      The speed of this algorithm depends on the sparse linear algebra
      library being used. ``Eigen``'s sparse QR factorization is a
      moderately fast algorithm suitable for small to medium sized
      matrices. For best performance we recommend using
      ``SuiteSparseQR`` which is enabled by setting
      :member:`Covariance::Options::sparse_linear_algebra_library_type`
      to ``SUITE_SPARSE``.

      ``SPARSE_QR`` cannot compute the covariance if the
      Jacobian is rank deficient.


   2. ``DENSE_SVD`` uses ``Eigen``'s ``JacobiSVD`` to perform the
      computations. It computes the singular value decomposition

      .. math::   U D V^\top = J

      and then uses it to compute the pseudo inverse of :math:`J'J` as

      .. math::   (J'J)^{\dagger} = V  D^{2\dagger}  V^\top

      It is an accurate but slow method and should only be used for
      small to moderate sized problems. It can handle full-rank as
      well as rank deficient Jacobians.


.. member:: double Covariance::Options::min_reciprocal_condition_number

   Default: :math:`10^{-14}`

   If the Jacobian matrix is near singular, then inverting :math:`J'J`
   will produce unreliable results, e.g., if

   .. math::

     J = \begin{bmatrix}
         1.0& 1.0 \\
         1.0& 1.0000001
         \end{bmatrix}

   which is essentially a rank deficient matrix, we have

   .. math::

     (J'J)^{-1} = \begin{bmatrix}
                  2.0471e+14&  -2.0471e+14 \\
                  -2.0471e+14&   2.0471e+14
                  \end{bmatrix}


   This is not a useful result. Therefore, by default
   :func:`Covariance::Compute` will return ``false`` if a rank
   deficient Jacobian is encountered. How rank deficiency is detected
   depends on the algorithm being used.

   1. ``DENSE_SVD``

      .. math:: \frac{\sigma_{\text{min}}}{\sigma_{\text{max}}}  < \sqrt{\text{min_reciprocal_condition_number}}

      where :math:`\sigma_{\text{min}}` and
      :math:`\sigma_{\text{max}}` are the minimum and maximum
      singular values of :math:`J` respectively.

   2. ``SPARSE_QR``

      .. math:: \operatorname{rank}(J) < \operatorname{num\_col}(J)

      Here :math:`\operatorname{rank}(J)` is the estimate of the rank
      of :math:`J` returned by the sparse QR factorization
      algorithm. It is a fairly reliable indication of rank
      deficiency.
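
   The blow-up above can be reproduced directly. A minimal standalone
   sketch (plain C++, independent of Ceres) that forms and inverts the
   normal matrix :math:`J'J` for this Jacobian:

   .. code-block:: c++

      #include <cassert>
      #include <cmath>

      // Invert H = J'J for a 2x2 Jacobian J; returns the determinant of
      // H and writes the inverse into inv (row-major).
      double InvertNormalMatrix(const double J[2][2], double inv[4]) {
        double a = J[0][0] * J[0][0] + J[1][0] * J[1][0];
        double b = J[0][0] * J[0][1] + J[1][0] * J[1][1];
        double d = J[0][1] * J[0][1] + J[1][1] * J[1][1];
        double det = a * d - b * b;
        inv[0] = d / det;
        inv[1] = -b / det;
        inv[2] = -b / det;
        inv[3] = a / det;
        return det;
      }

      int main() {
        // The nearly rank deficient Jacobian from the example above.
        const double J[2][2] = {{1.0, 1.0}, {1.0, 1.0000001}};
        double inv[4];
        double det = InvertNormalMatrix(J, inv);

        // det is ~1e-14, so the entries of (J'J)^{-1} blow up to ~1e14:
        // numerically meaningless as a covariance.
        assert(std::fabs(det) < 1e-13);
        assert(std::fabs(inv[0]) > 1e12);
        assert(std::fabs(inv[1]) > 1e12);
        return 0;
      }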

.. member:: int Covariance::Options::null_space_rank

    When using ``DENSE_SVD``, the user has more control in dealing
    with singular and near singular covariance matrices.

    As mentioned above, when the covariance matrix is near singular,
    instead of computing the inverse of :math:`J'J`, the Moore-Penrose
    pseudoinverse of :math:`J'J` should be computed.

    If :math:`J'J` has the eigen decomposition :math:`(\lambda_i,
    e_i)`, where :math:`\lambda_i` is the :math:`i^\textrm{th}`
    eigenvalue and :math:`e_i` is the corresponding eigenvector, then
    the inverse of :math:`J'J` is

    .. math:: (J'J)^{-1} = \sum_i \frac{1}{\lambda_i} e_i e_i'

    and computing the pseudo inverse involves dropping terms from this
    sum that correspond to small eigenvalues.

    How terms are dropped is controlled by
    `min_reciprocal_condition_number` and `null_space_rank`.

    If `null_space_rank` is non-negative, then the smallest
    `null_space_rank` eigenvalue/eigenvector pairs are dropped
    irrespective of the magnitude of :math:`\lambda_i`. If the ratio
    of the smallest non-zero eigenvalue to the largest eigenvalue in
    the truncated matrix is still below
    `min_reciprocal_condition_number`, then `Covariance::Compute()`
    will fail and return `false`.

    Setting `null_space_rank = -1` drops all terms for which

    .. math::  \frac{\lambda_i}{\lambda_{\textrm{max}}} < \textrm{min_reciprocal_condition_number}

    This option has no effect on ``SPARSE_QR``.
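
    The truncation above can be sketched in closed form for a symmetric
    2x2 matrix. This is an illustrative sketch (plain C++, independent
    of Ceres), not the library implementation:

    .. code-block:: c++

       #include <cassert>
       #include <cmath>

       // Pseudo inverse of the symmetric matrix [[a, b], [b, d]] via its
       // eigen decomposition, dropping eigenvalues whose ratio to the
       // largest eigenvalue falls below a threshold -- the truncation
       // performed when null_space_rank = -1. Writes a row-major 2x2
       // result into inv.
       void PseudoInverse2x2(double a, double b, double d,
                             double threshold, double inv[4]) {
         // Closed form eigenvalues of a symmetric 2x2 matrix.
         double mean = 0.5 * (a + d);
         double delta = std::sqrt(0.25 * (a - d) * (a - d) + b * b);
         double lambda[2] = {mean + delta, mean - delta};

         inv[0] = inv[1] = inv[2] = inv[3] = 0.0;
         for (int i = 0; i < 2; ++i) {
           // Drop terms corresponding to small eigenvalues.
           if (lambda[i] <= threshold * lambda[0]) continue;
           // Eigenvector for lambda[i] is (b, lambda[i] - a), normalized.
           double e0 = b, e1 = lambda[i] - a;
           double norm = std::sqrt(e0 * e0 + e1 * e1);
           if (norm == 0.0) { e0 = 1.0; norm = 1.0; }
           e0 /= norm;
           e1 /= norm;
           inv[0] += e0 * e0 / lambda[i];
           inv[1] += e0 * e1 / lambda[i];
           inv[3] += e1 * e1 / lambda[i];
         }
         inv[2] = inv[1];
       }

       int main() {
         // J'J = [[1, 1], [1, 1]] has eigenvalues 2 and 0. The zero
         // eigenvalue is dropped, giving the pseudo inverse
         // [[0.25, 0.25], [0.25, 0.25]].
         double inv[4];
         PseudoInverse2x2(1.0, 1.0, 1.0, 1e-14, inv);
         assert(std::fabs(inv[0] - 0.25) < 1e-12);
         assert(std::fabs(inv[1] - 0.25) < 1e-12);
         assert(std::fabs(inv[3] - 0.25) < 1e-12);
         return 0;
       }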

.. member:: bool Covariance::Options::apply_loss_function

   Default: `true`

   Even though the residual blocks in the problem may contain loss
   functions, setting ``apply_loss_function`` to ``false`` will turn
   off the application of the loss function to the output of the cost
   function and in turn its effect on the covariance.

.. class:: Covariance

   :class:`Covariance::Options` as the name implies is used to control
   the covariance estimation algorithm. Covariance estimation is a
   complicated and numerically sensitive procedure. Please read the
   entire documentation for :class:`Covariance::Options` before using
   :class:`Covariance`.

.. function:: bool Covariance::Compute(const vector<pair<const double*, const double*> >& covariance_blocks, Problem* problem)

   Compute a part of the covariance matrix.

   The vector ``covariance_blocks`` indexes into the covariance
   matrix block-wise using pairs of parameter blocks. This allows the
   covariance estimation algorithm to only compute and store these
   blocks.

   Since the covariance matrix is symmetric, if the user passes
   ``<block1, block2>``, then ``GetCovarianceBlock`` can be called with
   ``block1``, ``block2`` as well as ``block2``, ``block1``.

   ``covariance_blocks`` cannot contain duplicates. Bad things will
   happen if they do.

   Note that the list of ``covariance_blocks`` is only used to
   determine what parts of the covariance matrix are computed. The
   full Jacobian is used to do the computation, i.e. it does not have
   an impact on what part of the Jacobian is used for computation.

   The return value indicates the success or failure of the covariance
   computation. Please see the documentation for
   :class:`Covariance::Options` for more on the conditions under which
   this function returns ``false``.

.. function:: bool GetCovarianceBlock(const double* parameter_block1, const double* parameter_block2, double* covariance_block) const

   Return the block of the cross-covariance matrix corresponding to
   ``parameter_block1`` and ``parameter_block2``.

   ``Compute`` must be called before the first call to
   ``GetCovarianceBlock`` and the pair ``<parameter_block1,
   parameter_block2>`` OR the pair ``<parameter_block2,
   parameter_block1>`` must have been present in the vector
   ``covariance_blocks`` when ``Compute`` was called. Otherwise
   ``GetCovarianceBlock`` will return ``false``.

   ``covariance_block`` must point to a memory location that can store
   a ``parameter_block1_size x parameter_block2_size`` matrix. The
   returned covariance will be a row-major matrix.

.. function:: bool GetCovarianceBlockInTangentSpace(const double* parameter_block1, const double* parameter_block2, double* covariance_block) const

   Return the block of the cross-covariance matrix corresponding to
   ``parameter_block1`` and ``parameter_block2``. Returns the
   cross-covariance in the tangent space if a local parameterization
   is associated with either parameter block; else returns the
   cross-covariance in the ambient space.

   ``Compute`` must be called before the first call to
   ``GetCovarianceBlockInTangentSpace`` and the pair
   ``<parameter_block1, parameter_block2>`` OR the pair
   ``<parameter_block2, parameter_block1>`` must have been present in
   the vector ``covariance_blocks`` when ``Compute`` was
   called. Otherwise ``GetCovarianceBlockInTangentSpace`` will return
   ``false``.

   ``covariance_block`` must point to a memory location that can store
   a ``parameter_block1_local_size x parameter_block2_local_size``
   matrix. The returned covariance will be a row-major matrix.

Example Usage
=============

.. code-block:: c++

 double x[3];
 double y[2];

 Problem problem;
 problem.AddParameterBlock(x, 3);
 problem.AddParameterBlock(y, 2);
 <Build Problem>
 <Solve Problem>

 Covariance::Options options;
 Covariance covariance(options);

 vector<pair<const double*, const double*> > covariance_blocks;
 covariance_blocks.push_back(make_pair(x, x));
 covariance_blocks.push_back(make_pair(y, y));
 covariance_blocks.push_back(make_pair(x, y));

 CHECK(covariance.Compute(covariance_blocks, &problem));

 double covariance_xx[3 * 3];
 double covariance_yy[2 * 2];
 double covariance_xy[3 * 2];
 covariance.GetCovarianceBlock(x, x, covariance_xx);
 covariance.GetCovarianceBlock(y, y, covariance_yy);
 covariance.GetCovarianceBlock(x, y, covariance_xy);