The CDF is a data processing method (Ceccherini et al., 2015) [RD1] that combines several independent measurements of an atmospheric vertical profile retrieved with the optimal estimation method (Rodgers, 2000) [RD7]. Suppose we have N independent, simultaneous measurements of the vertical profile of an atmospheric parameter, all referring to a specific geolocation. Retrieving each of the N measurements yields N vectors \hat{x}_i \; (i = 1, 2, \dots, N), which provide independent estimates of the profile on a generic vertical grid. Each vector \hat{x}_i is characterized by the covariance matrix (CM) of its noise error, S_{ni}, and by its averaging kernel matrix (AKM), A_i.
The CDF solution is obtained by minimizing the following cost function:
c(x) = \sum_{i=1}^{N} (a_i - A_i x)^{T} S_{ni}^{-1} (a_i - A_i x) + (x - x_a)^{T} S_a^{-1} (x - x_a)
where x_a and S_a are, respectively, the a priori profile and CM used to constrain the fused profile, x_{ai} is the a priori profile used in the i-th retrieval, and
a_i = \hat{x}_i - x_{ai} + A_i x_{ai}

CDF in perfect coincidence and on a common vertical grid

The original CDF solution (Ceccherini et al., 2015) is given by:
x_f = \left( \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i + S_a^{-1} \right)^{-1} \left( \sum_{i=1}^{N} A_i^T S_{ni}^{-1} a_i + S_a^{-1} x_a \right)
and is characterized by the AKM A_f and by the CMs of the noise error S_{nf}, of the smoothing error S_{sf} and of the total error S_{f}, given by:
A_f = \left( \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i + S_a^{-1} \right)^{-1} \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i
S_{nf} = \left( \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i + S_a^{-1} \right)^{-1} \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i \left( \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i + S_a^{-1} \right)^{-1}
S_{sf} = \left( \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i + S_a^{-1} \right)^{-1} S_a^{-1} \left( \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i + S_a^{-1} \right)^{-1}
S_f = S_{nf} + S_{sf} = \left( \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i + S_a^{-1} \right)^{-1}
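The CDF(2015) formulas above can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation: the function name `cdf_2015`, its argument order and the toy input values are assumptions made here, and the sketch assumes that every S_{ni} is non-singular.

```python
import numpy as np

def cdf_2015(x_hats, x_ais, As, S_ns, x_a, S_a):
    """Fuse N retrieved profiles with the original CDF formulas.

    Hypothetical helper: assumes every noise CM S_ni is invertible.
    """
    S_a_inv = np.linalg.inv(S_a)
    info = np.zeros(S_a.shape)           # sum_i A_i^T S_ni^-1 A_i
    rhs = S_a_inv @ x_a                  # S_a^-1 x_a + sum_i A_i^T S_ni^-1 a_i
    for x_hat, x_ai, A, S_n in zip(x_hats, x_ais, As, S_ns):
        a = x_hat - x_ai + A @ x_ai      # a_i as defined in the text
        W = A.T @ np.linalg.inv(S_n)
        info = info + W @ A
        rhs = rhs + W @ a
    S_f = np.linalg.inv(info + S_a_inv)  # total error CM
    x_f = S_f @ rhs                      # fused profile
    A_f = S_f @ info                     # fused AKM
    S_nf = S_f @ info @ S_f              # noise error CM
    S_sf = S_f @ S_a_inv @ S_f           # smoothing error CM
    return x_f, A_f, S_nf, S_sf, S_f

# Toy example: two 3-level profiles with diagonal AKMs and noise CMs
# (illustrative values only).
n = 3
x_a, S_a = np.zeros(n), np.eye(n)
x_ais = [np.zeros(n), np.ones(n)]
As = [0.6 * np.eye(n), 0.8 * np.eye(n)]
S_ns = [0.10 * np.eye(n), 0.05 * np.eye(n)]
x_hats = [np.array([1.0, 2.0, 3.0]), np.array([1.2, 1.9, 3.1])]
x_f, A_f, S_nf, S_sf, S_f = cdf_2015(x_hats, x_ais, As, S_ns, x_a, S_a)
```

Explicit inverses are used only to mirror the equations term by term; a production code would more likely rely on linear solves.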
The formulas of the original CDF solution contain the inverses of the noise CMs S_{ni}, which are often singular. In such cases the inverse of S_{ni} has to be replaced with its generalized inverse (Kalman, 1976) [RD6]. Using the generalized inverse makes the solution approximate and requires choosing a threshold on the eigenvalues of S_{ni}: the inverses of the eigenvalues smaller than this threshold are replaced with zeros. A threshold that is too small introduces significant numerical noise in the products, whereas one that is too large discards useful information. To overcome the problems related to the inversion of S_{ni}, a new formulation of the CDF equations was presented in Ceccherini et al. (2022) [RD4]. This formulation can be derived from the Kalman filter method (Kalman, 1960 [RD5]; Ceccherini, 2022 [RD3]) and is equivalent to the original formulation of the CDF when the matrices S_{ni} are non-singular. Its equations do not contain the inverses of S_{ni}; instead, they contain the inverses of the CMs of the total errors S_{i} of the individual retrievals, which are never singular. To distinguish the two formulations, we refer to the original one as CDF(2015) and to the new one as CDF(2022).
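The eigenvalue thresholding used for the generalized inverse can be sketched as follows. The function name `generalized_inverse` and the choice of a threshold relative to the largest eigenvalue are assumptions made for illustration; the text does not say whether the threshold is absolute or relative.

```python
import numpy as np

def generalized_inverse(S, rel_threshold=1e-10):
    """Generalized inverse of a symmetric CM via eigen-decomposition.

    Eigenvalues not exceeding rel_threshold * (largest eigenvalue) have
    their inverses replaced with zero (hypothetical threshold convention).
    """
    S = 0.5 * (S + S.T)                  # enforce exact symmetry
    w, V = np.linalg.eigh(S)             # S = V diag(w) V^T
    keep = w > rel_threshold * w.max()
    w_inv = np.zeros_like(w)
    w_inv[keep] = 1.0 / w[keep]
    return (V * w_inv) @ V.T             # V diag(w_inv) V^T

# Singular example: the third parameter carries no noise information.
S_sing = np.array([[2.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 0.0]])
G_inv = generalized_inverse(S_sing)
```

Raising `rel_threshold` discards more eigen-directions, which is the trade-off between numerical noise and loss of information mentioned above.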
The equations of CDF(2022) are:
x_f = \left( \sum_{i=1}^{N} S_i^{-1} A_i + S_a^{-1} \right)^{-1} \left( \sum_{i=1}^{N} S_i^{-1} a_i + S_a^{-1} x_a \right)
A_f = \left( \sum_{i=1}^{N} S_i^{-1} A_i + S_a^{-1} \right)^{-1} \sum_{i=1}^{N} S_i^{-1} A_i
S_{nf} = \left( \sum_{i=1}^{N} S_i^{-1} A_i + S_a^{-1} \right)^{-1} \sum_{i=1}^{N} S_i^{-1} A_i \left( \sum_{i=1}^{N} S_i^{-1} A_i + S_a^{-1} \right)^{-1}
S_{sf} = \left( \sum_{i=1}^{N} S_i^{-1} A_i + S_a^{-1} \right)^{-1} S_a^{-1} \left( \sum_{i=1}^{N} S_i^{-1} A_i + S_a^{-1} \right)^{-1}
S_f = S_{nf} + S_{sf} = \left( \sum_{i=1}^{N} S_i^{-1} A_i + S_a^{-1} \right)^{-1}
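The stated equivalence of the two formulations for non-singular S_{ni} can be checked numerically. The sketch below assumes linear optimal-estimation retrievals with a hypothetical Jacobian K, measurement noise CM S_y and retrieval a priori CM S_ai, none of which appear in the text; under that assumption S_i = (K^T S_y^{-1} K + S_ai^{-1})^{-1}. The function name `cdf_2022` and all numerical values are illustrative.

```python
import numpy as np

def cdf_2022(x_hats, x_ais, As, S_tots, x_a, S_a):
    """CDF(2022): uses the total-error CMs S_i instead of the noise CMs S_ni."""
    S_a_inv = np.linalg.inv(S_a)
    info = np.zeros(S_a.shape)            # sum_i S_i^-1 A_i
    rhs = S_a_inv @ x_a                   # S_a^-1 x_a + sum_i S_i^-1 a_i
    for x_hat, x_ai, A, S in zip(x_hats, x_ais, As, S_tots):
        S_inv = np.linalg.inv(S)
        info = info + S_inv @ A
        rhs = rhs + S_inv @ (x_hat - x_ai + A @ x_ai)
    S_f = np.linalg.inv(info + S_a_inv)
    return S_f @ rhs, S_f @ info, S_f @ info @ S_f, S_f @ S_a_inv @ S_f, S_f

# Synthetic linear optimal-estimation retrievals (hypothetical set-up).
rng = np.random.default_rng(1)
n, m = 3, 5                               # state and measurement sizes
x_true = rng.normal(size=n)
x_a, S_a = np.zeros(n), 4.0 * np.eye(n)
S_y = 0.01 * np.eye(m)
x_hats, x_ais, As, S_tots, S_ns = [], [], [], [], []
for _ in range(2):
    K = rng.normal(size=(m, n))
    x_ai, S_ai = rng.normal(size=n), np.eye(n)
    S_i = np.linalg.inv(K.T @ np.linalg.inv(S_y) @ K + np.linalg.inv(S_ai))
    G = S_i @ K.T @ np.linalg.inv(S_y)    # gain matrix
    y = K @ x_true + rng.multivariate_normal(np.zeros(m), S_y)
    x_hats.append(x_ai + G @ (y - K @ x_ai))
    x_ais.append(x_ai)
    As.append(G @ K)                      # AKM
    S_tots.append(S_i)                    # total error CM
    S_ns.append(G @ S_y @ G.T)            # noise error CM

x_f, A_f, S_nf, S_sf, S_f = cdf_2022(x_hats, x_ais, As, S_tots, x_a, S_a)

# Same fusion with the CDF(2015) equations, for comparison.
M, r = np.linalg.inv(S_a), np.linalg.inv(S_a) @ x_a
for x_hat, x_ai, A, S_n in zip(x_hats, x_ais, As, S_ns):
    W = A.T @ np.linalg.inv(S_n)
    M, r = M + W @ A, r + W @ (x_hat - x_ai + A @ x_ai)
x_f_2015 = np.linalg.solve(M, r)
```

In this set-up `x_f` and `x_f_2015` should agree to numerical precision, because A_i^T S_{ni}^{-1} A_i and S_i^{-1} A_i both reduce to K^T S_y^{-1} K.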

Application of the CDF to Multi-target retrievals (MTRs)

The CDF algorithm described in the previous subsections applies to retrieval products of a single atmospheric parameter, but it can be extended to MTR products, whose state vectors include several atmospheric parameters. This extension is described in Tirelli et al. (2021) [RD9] for the CDF(2015) formulation. For simplicity, we consider the fusion of two MTR products that are exactly co-located in space and time and defined on the same vertical grid. If the two retrieved state vectors contain the same parameters, the standard CDF formulas described in the previous subsections can be applied. If they contain different parameters but share at least one, the inputs to the CDF have to be modified. As an example, consider two instruments: the state vector of the first contains the vertical profiles of the parameters P1 and P2, and that of the second contains the vertical profiles of the parameters P1 and P3:
\hat{x}_1 = \begin{pmatrix} \hat{P}_{11} \\ \hat{P}_{21} \end{pmatrix} \quad \hat{x}_2 = \begin{pmatrix} \hat{P}_{12} \\ \hat{P}_{32} \end{pmatrix}

The two vectors \hat{x}_1 and \hat{x}_2 are characterized by the AKMs A_1 and A_2 and by the noise CMs S_{n1} and S_{n2}. The structures of these matrices are the following:

A_1 = \begin{pmatrix} A_{11,1} & A_{12,1} \\ A_{21,1} & A_{22,1} \end{pmatrix} \qquad A_2 = \begin{pmatrix} A_{11,2} & A_{13,2} \\ A_{31,2} & A_{33,2} \end{pmatrix}
S_{n1} = \begin{pmatrix} s_{11,n1} & s_{12,n1} \\ s_{21,n1} & s_{22,n1} \end{pmatrix} \qquad S_{n2} = \begin{pmatrix} s_{11,n2} & s_{13,n2} \\ s_{31,n2} & s_{33,n2} \end{pmatrix}

where:

A_{rq,i} = \frac{\partial \hat{P}_{ri}}{\partial P_q} \qquad i = 1,2 \quad r = 1,2,3 \quad q = 1,2,3
s_{rq,ni} = \left\langle (\hat{P}_{ri} - \langle \hat{P}_{ri} \rangle) (\hat{P}_{qi} - \langle \hat{P}_{qi} \rangle)^T \right\rangle \qquad i = 1,2 \quad r = 1,2,3 \quad q = 1,2,3
To apply the CDF method, the state vectors are extended to the union of the parameters retrieved from the different measurements. New AKMs and noise CMs are built by adding submatrices for the non-retrieved parameters; these submatrices are set to zero because no information is retrieved for those parameters. The new inputs to the CDF, to be used with CDF(2015), are:
\hat{x}'_{1} = \begin{pmatrix} \hat{P}_{11} \\ \hat{P}_{21} \\ 0 \end{pmatrix} \qquad \hat{x}'_{2} = \begin{pmatrix} \hat{P}_{12} \\ 0 \\ \hat{P}_{32} \end{pmatrix}
A'_{1} = \begin{pmatrix} A_{11,1} & A_{12,1} & 0 \\ A_{21,1} & A_{22,1} & 0 \\ 0 & 0 & 0 \end{pmatrix} \qquad A'_{2} = \begin{pmatrix} A_{11,2} & 0 & A_{13,2} \\ 0 & 0 & 0 \\ A_{31,2} & 0 & A_{33,2} \end{pmatrix}
S'_{n1} = \begin{pmatrix} s_{11,n1} & s_{12,n1} & 0 \\ s_{21,n1} & s_{22,n1} & 0 \\ 0 & 0 & 0 \end{pmatrix} \qquad S'_{n2} = \begin{pmatrix} s_{11,n2} & 0 & s_{13,n2} \\ 0 & 0 & 0 \\ s_{31,n2} & 0 & s_{33,n2} \end{pmatrix}
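The zero-padding of the inputs onto the union state vector can be sketched as follows. The helper name `embed_in_union`, the 0-based parameter indices and the assumption that every parameter is given on the same number of levels `n_lev` are illustrative choices, not part of the source.

```python
import numpy as np

def embed_in_union(x_hat, A, S_n, present, n_params, n_lev):
    """Zero-pad one MTR product onto the union state vector.

    present: 0-based indices of the parameters contained in this product,
    in the order in which they appear in x_hat (hypothetical convention).
    """
    idx = np.concatenate([np.arange(p * n_lev, (p + 1) * n_lev) for p in present])
    size = n_params * n_lev
    x_new = np.zeros(size)
    A_new = np.zeros((size, size))
    S_new = np.zeros((size, size))
    x_new[idx] = x_hat
    A_new[np.ix_(idx, idx)] = A          # blocks of non-retrieved parameters stay zero
    S_new[np.ix_(idx, idx)] = S_n
    return x_new, A_new, S_new

# Second instrument of the example: it retrieves (P1, P3) on 2 levels each.
x_hat2 = np.array([1.0, 2.0, 3.0, 4.0])
A2 = np.arange(16, dtype=float).reshape(4, 4)   # placeholder values
S_n2 = np.eye(4)
x2p, A2p, S2p = embed_in_union(x_hat2, A2, S_n2, present=[0, 2], n_params=3, n_lev=2)
```

The padded noise CM has rank 4 in a 6-dimensional state space, which makes its singularity explicit.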

Since these new noise CMs contain rows and columns equal to zero, they are singular, and their inversion requires the generalized inverse. Using the new matrices (Eqs. (39)–(41)) as input to the CDF algorithm, we obtain a solution that contains both the common parameter and the parameters that are not in common:

x_f = \begin{pmatrix} P_{1f} \\ P_{2f} \\ P_{3f} \end{pmatrix} \qquad A_f = \begin{pmatrix} A_{11,f} & A_{12,f} & A_{13,f} \\ A_{21,f} & A_{22,f} & A_{23,f} \\ A_{31,f} & A_{32,f} & A_{33,f} \end{pmatrix} \qquad S_{nf} = \begin{pmatrix} s_{11,nf} & s_{12,nf} & s_{13,nf} \\ s_{21,nf} & s_{22,nf} & s_{23,nf} \\ s_{31,nf} & s_{32,nf} & s_{33,nf} \end{pmatrix}

The CDF improves the knowledge of the common parameter, but it also improves the knowledge of the parameters observed by only one of the two instruments. The gain in information content for the parameters not in common is directly connected to the level of correlation between the common parameter and those not in common.
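A toy calculation can illustrate the role of the correlation. The sketch below is a strongly idealized case, not taken from the source: each parameter has a single level, the AKMs are the identity on the retrieved parameters, the correlation enters only through the noise CM of the first instrument, and `np.linalg.pinv` stands in for the generalized inverse. All names and values are hypothetical.

```python
import numpy as np

def fused_total_cov(As, S_ns, S_a):
    """Total error CM S_f of CDF(2015), with pseudo-inverses for padded noise CMs."""
    M = np.linalg.inv(S_a)
    for A, S_n in zip(As, S_ns):
        M = M + A.T @ np.linalg.pinv(S_n) @ A
    return np.linalg.inv(M)

def p2_variance(rho, fuse=True):
    """Total error variance of P2 for a noise correlation rho between P1 and P2."""
    # Union state (P1, P2, P3): instrument 1 sees (P1, P2), instrument 2 sees (P1, P3).
    A1 = np.diag([1.0, 1.0, 0.0])
    A2 = np.diag([1.0, 0.0, 1.0])
    S_n1 = np.array([[1.0, rho, 0.0],
                     [rho, 1.0, 0.0],
                     [0.0, 0.0, 0.0]])
    S_n2 = np.diag([0.1, 0.0, 0.1])
    S_a = 100.0 * np.eye(3)
    if fuse:
        S_f = fused_total_cov([A1, A2], [S_n1, S_n2], S_a)
    else:
        S_f = fused_total_cov([A1], [S_n1], S_a)   # instrument 1 alone
    return S_f[1, 1]

gain_uncorrelated = p2_variance(0.0, fuse=False) - p2_variance(0.0, fuse=True)
gain_correlated = p2_variance(0.9, fuse=False) - p2_variance(0.9, fuse=True)
```

In this idealized case the second instrument does not reduce the P2 variance when rho is zero, while a non-zero rho lets its information on P1 propagate to P2.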