The CDF(2022) formulation (Ceccherini et al., 2022) is the recommended implementation of the Complete Data Fusion algorithm. Unlike the original CDF(2015), it avoids inverting the noise covariance matrices \mathbf{S}_{ni}, which are often singular, and instead inverts the total error covariance matrices \mathbf{S}_{i}, which are always invertible.
The three formulations below correspond to progressively more general measurement scenarios.
Configuration A — Perfect coincidence, common vertical grid
This is the simplest case: the N measurements are perfectly co-located in space and time and are all retrieved on the same vertical grid, so no interpolation or coincidence error terms are needed.
The fused profile is:
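where \mathbf{\alpha}_{i} = \widehat{\mathbf{x}}_{i} - \mathbf{x}_{ai} + \mathbf{A}_{i}\mathbf{x}_{ai}, with \widehat{\mathbf{x}}_{i}, \mathbf{x}_{ai} and \mathbf{A}_{i} the retrieved profile, the a priori profile and the averaging kernel matrix of the i-th product, and \mathbf{x}_{a}, \mathbf{S}_{a} are the a priori profile and covariance matrix used to constrain the fused product.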
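The vector \mathbf{\alpha}_{i} defined above can be computed directly from the three per-product inputs. The following is a minimal NumPy sketch; the function name `alpha_vector` and the numerical values are hypothetical and only serve as illustration.

```python
import numpy as np


def alpha_vector(x_hat_i, x_a_i, A_i):
    """Return alpha_i = x_hat_i - x_ai + A_i @ x_ai for one product."""
    x_hat_i = np.asarray(x_hat_i, dtype=float)
    x_a_i = np.asarray(x_a_i, dtype=float)
    A_i = np.asarray(A_i, dtype=float)
    return x_hat_i - x_a_i + A_i @ x_a_i


# Illustrative values only (three vertical levels).
x_hat = np.array([1.2, 0.9, 0.4])   # retrieved profile of product i
x_a_i = np.array([1.0, 1.0, 0.5])   # a priori profile of product i
A_i = np.diag([0.8, 0.5, 0.1])      # averaging kernel matrix of product i

alpha = alpha_vector(x_hat, x_a_i, A_i)
```

When \mathbf{A}_{i} is the identity, \mathbf{\alpha}_{i} reduces to the retrieved profile; when \mathbf{A}_{i} is zero, it reduces to the departure of the retrieval from its own a priori.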
The averaging kernel matrix (AKM) and the covariance matrices (CMs) of the fused profile are:
Note that \mathbf{S}_{i} = \mathbf{S}_{ni} + \mathbf{S}_{si}, the sum of the noise CM \mathbf{S}_{ni} and the smoothing error CM \mathbf{S}_{si}, is the total error CM of the i-th product and is always invertible.
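The difference between a singular noise CM and an invertible total error CM can be checked numerically. The sketch below assumes, purely for illustration, that the product comes from a linear optimal-estimation retrieval with fewer measurements than profile levels; the Jacobian `K`, the unit measurement noise and the retrieval a priori CM `S_ai` are hypothetical inputs that do not appear in the text above.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 3                      # n profile levels, m < n measurements (assumed sizes)

K = rng.normal(size=(m, n))      # hypothetical Jacobian of the forward model
S_ai = np.eye(n)                 # hypothetical a priori CM of the retrieval
F = K.T @ K                      # Fisher information for unit measurement noise (rank m)

# Linear optimal-estimation quantities (assumed retrieval type).
S_i = np.linalg.inv(F + np.linalg.inv(S_ai))   # total error CM
A_i = S_i @ F                                  # averaging kernel matrix
S_ni = S_i @ F @ S_i                           # noise CM, rank m < n
I_n = np.eye(n)
S_si = (I_n - A_i) @ S_ai @ (I_n - A_i).T      # smoothing error CM

# An explicit tolerance keeps rounding noise from being counted as rank.
rank_noise = np.linalg.matrix_rank(S_ni, tol=1e-10)
rank_total = np.linalg.matrix_rank(S_ni + S_si, tol=1e-10)
print(rank_noise, rank_total)
```

In this setting the noise CM has rank m and cannot be inverted, while the sum of the noise and smoothing error CMs has full rank n.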
Configuration B — Different vertical grids, perfect coincidence
The N measurements are co-located in space and time but retrieved on different vertical grids, so vertical interpolation is required. Since the measurements are perfectly coincident, the coincidence error term \mathbf{S}_{\text{coin}} is zero.
Let \mathbf{H}_{i} be the interpolation matrix that maps profiles from the i-th retrieval grid to the fusion grid, and \mathbf{R}_{i} its generalised inverse. Let \mathbf{C}^{(i)} and \mathbf{C}^{(f)} be the sampling matrices from a common fine grid to the i-th retrieval grid and to the fusion grid, respectively.
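As an illustration of the two grid operators, the sketch below builds \mathbf{H}_{i} as a linear interpolation matrix and takes \mathbf{R}_{i} as its Moore-Penrose pseudo-inverse. Both choices are assumptions made for this example: the text above only requires an interpolation matrix and a generalised inverse. The helper `linear_interp_matrix` and the grids are hypothetical.

```python
import numpy as np


def linear_interp_matrix(z_from, z_to):
    """Return H such that H @ x equals np.interp(z_to, z_from, x).

    z_from must be increasing, as required by np.interp.
    """
    z_from = np.asarray(z_from, dtype=float)
    z_to = np.asarray(z_to, dtype=float)
    eye = np.eye(z_from.size)
    # Interpolating each unit vector yields the corresponding column of H.
    cols = [np.interp(z_to, z_from, eye[:, j]) for j in range(z_from.size)]
    return np.column_stack(cols)


z_i = np.array([0.0, 10.0, 20.0, 30.0])   # hypothetical retrieval grid (km)
z_f = np.linspace(0.0, 30.0, 7)           # hypothetical fusion grid (km)

H_i = linear_interp_matrix(z_i, z_f)      # shape (7, 4): retrieval grid -> fusion grid
R_i = np.linalg.pinv(H_i)                 # shape (4, 7): one possible generalised inverse
```

With these grids \mathbf{H}_{i} has full column rank, so \mathbf{R}_{i}\mathbf{H}_{i} is the identity on the retrieval grid. In general only the defining property \mathbf{H}_{i}\mathbf{R}_{i}\mathbf{H}_{i} = \mathbf{H}_{i} of a generalised inverse should be relied on.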
The modified total error CM for each product is:
where \mathbf{S}_{\text{a,fine}} is the fusion a priori CM on the fine grid. The modified vector {\widetilde{\mathbf{\alpha}}}_{i} is:
The fused profile and its characterisation matrices are then obtained by substituting {\widetilde{\mathbf{S}}}_{i} and {\widetilde{\mathbf{\alpha}}}_{i} into Configuration A, with the additional factor \mathbf{R}_{i}:
In this configuration the noise CM of the fused profile, \mathbf{S}_{\text{nf}}, includes both the original noise errors and the interpolation errors. The smoothing errors are unchanged in structure.
Configuration C — Different vertical grids, imperfect coincidence
This is the most general case: the N measurements are retrieved on different grids and are also not perfectly co-located in space or time. An additional coincidence error term \mathbf{S}_{\text{coin}} — the CM describing the variability of the true profiles at the different measurement locations/times — is added to the modified total error CM (Ceccherini et al., 2022).
The modified total error CM becomes:
The vector {\widetilde{\mathbf{\alpha}}}_{i} is the same as in Configuration B (Eq. B.2). The fused profile and its characterisation matrices are given by the same equations (B.3)–(B.7), with {\widetilde{\mathbf{S}}}_{i} now defined by (C.1) instead of (B.1).
Configuration B is recovered as the special case \mathbf{S}_{\text{coin}} = 0, and Configuration A is recovered when additionally \mathbf{R}_{i} = \mathbf{I} and \mathbf{C}^{(i)} = \mathbf{C}^{(f)}.
Summary of inputs per configuration
| Input | Config A | Config B | Config C |
|---|---|---|---|
| \widehat{\mathbf{x}}_{i},\, \mathbf{A}_{i},\, \mathbf{S}_{i} | ✓ | ✓ | ✓ |
| \mathbf{x}_{a},\, \mathbf{S}_{a} (fusion a priori) | ✓ | ✓ | ✓ |
| \mathbf{H}_{i},\, \mathbf{R}_{i},\, \mathbf{C}^{(i)},\, \mathbf{C}^{(f)},\, \mathbf{S}_{\text{a,fine}} | — | ✓ | ✓ |
| \mathbf{S}_{\text{coin}} | — | — | ✓ |
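
The table above can be turned into a simple input check that runs before a fusion. This is only a sketch: the string keys (`x_hat_i`, `S_a_fine`, and so on) and the function `missing_inputs` are hypothetical names chosen to mirror the table rows, not part of any existing package.

```python
# Hypothetical input names, mirroring the rows of the summary table.
PRODUCT_INPUTS = {"x_hat_i", "A_i", "S_i"}
PRIOR_INPUTS = {"x_a", "S_a"}
GRID_INPUTS = {"H_i", "R_i", "C_i", "C_f", "S_a_fine"}
COINCIDENCE_INPUTS = {"S_coin"}

REQUIRED_INPUTS = {
    "A": PRODUCT_INPUTS | PRIOR_INPUTS,
    "B": PRODUCT_INPUTS | PRIOR_INPUTS | GRID_INPUTS,
    "C": PRODUCT_INPUTS | PRIOR_INPUTS | GRID_INPUTS | COINCIDENCE_INPUTS,
}


def missing_inputs(config, provided):
    """Return the sorted input names still needed for configuration A, B or C."""
    if config not in REQUIRED_INPUTS:
        raise ValueError(f"unknown configuration: {config!r}")
    return sorted(REQUIRED_INPUTS[config] - set(provided))


# Inputs sufficient for Configuration A, checked against Configuration C.
available = ["x_hat_i", "A_i", "S_i", "x_a", "S_a"]
print(missing_inputs("C", available))
```

Because each configuration's inputs are a superset of the previous one's, the same check also shows which additional quantities are needed to move from Configuration A to B or from B to C.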
