The CDF algorithm can be extended beyond the standard configurations to handle more complex measurement scenarios and to address important practical considerations for real-world implementation.
Extension to Non-Overlapping Vertical Grids
The basic CDF formulations assume that the fused product grid is contained within the vertical range covered by all input measurements. This constraint limits applicability in cases where different instruments measure different altitude ranges.
An extended algorithm, inspired by the multi-target retrieval methodology, allows fusion on a union of vertical grids rather than their intersection. Profiles are padded with zeros (or missing-data markers) at altitudes where no measurement is available, and modified averaging kernel and covariance matrices are constructed accordingly.
Extended Grid Approach
Consider three profiles on different vertical grids:
- Profile 1: Defined only in upper and central altitudes (P11, P21)
- Profile 2: Defined only in central and lower altitudes (P22, P32)
- Profile 3: Complete coverage (P13, P23, P33)
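In the standard approach, only the central region (P2) could be used for fusion. With the extended algorithm, all profiles are extended to cover the complete grid. With zero padding, for example, Profile 1 becomes (P11, P21, 0) and Profile 2 becomes (0, P22, P32), while Profile 3 is unchanged.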
The matrices are extended with zeros (or identity/null blocks) in corresponding positions. After fusion, the result contains estimates across the entire extended grid:
- Upper region: Constrained by Profiles 1 and 3 only
- Central region: Constrained by all three profiles
- Lower region: Constrained by Profiles 2 and 3 only
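The padding step can be sketched in a few lines of NumPy. This is only an illustration under simplifying assumptions: each region is reduced to a single level, zero blocks are used for the missing levels, and the function name `extend_to_union_grid`, the index convention, and the toy values are all hypothetical rather than taken from the CDF formulation itself.

```python
import numpy as np

def extend_to_union_grid(x, A, S, idx, n_full):
    """Embed one profile and its matrices into a grid of n_full levels.

    idx lists the positions of the profile's native levels in the full grid
    (hypothetical convention). Levels without a measurement are zero-filled.
    """
    idx = np.asarray(idx)
    x_ext = np.zeros(n_full)
    A_ext = np.zeros((n_full, n_full))
    S_ext = np.zeros((n_full, n_full))
    covered = np.zeros(n_full, dtype=bool)
    x_ext[idx] = x
    A_ext[np.ix_(idx, idx)] = A   # averaging kernel block on native levels
    S_ext[np.ix_(idx, idx)] = S   # covariance block on native levels
    covered[idx] = True
    return x_ext, A_ext, S_ext, covered

# Toy case: one level per region (0 = upper, 1 = central, 2 = lower).
profiles = [
    (np.array([1.0, 2.0]), [0, 1]),          # Profile 1: upper + central
    (np.array([2.1, 3.0]), [1, 2]),          # Profile 2: central + lower
    (np.array([0.9, 1.9, 3.1]), [0, 1, 2]),  # Profile 3: complete coverage
]
extended = [
    extend_to_union_grid(x, 0.8 * np.eye(len(x)), 0.1 * np.eye(len(x)), idx, 3)
    for x, idx in profiles
]

# Number of profiles constraining each region.
coverage = sum(e[3].astype(int) for e in extended)
print(coverage)  # [2 3 2]
```

The coverage count reproduces the pattern listed above: two profiles in the upper and lower regions, three in the central one. Note that zero blocks make the extended covariance matrices singular, so how the fusion step treats the padded levels is part of what has to be validated.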
Important: This extension is original and requires thorough testing before operational deployment. The handling of region boundaries and the physical consistency of the fused product across altitude discontinuities in coverage must be carefully validated.
Critical Aspects of Numerical Implementation
The CDF algorithm, while theoretically elegant, presents several practical challenges when implemented numerically. Understanding these issues is essential for robust, reliable applications.
Matrix Inversion Issues
CDF(2015): Requires inversion of the noise covariance matrix \mathbf{S}_{ni}, which is often singular or near-singular. The algorithm therefore uses the Moore-Penrose pseudo-inverse computed via singular value decomposition (SVD), in which singular values below a chosen threshold are treated as zero.
The choice of eigenvalue (singular-value) threshold is subjective and can significantly affect results. A very small threshold lets numerical noise be amplified in the inverse; a very large threshold discards useful information.
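The threshold sensitivity can be seen in a minimal sketch of a truncated pseudo-inverse. The function name `truncated_pinv`, the relative-threshold convention (a fraction of the largest singular value), and the toy matrix are assumptions made for illustration; they are not the settings used in CDF(2015).

```python
import numpy as np

def truncated_pinv(S, rel_threshold):
    """Pseudo-inverse of S keeping singular values above rel_threshold * s_max.

    Returns the pseudo-inverse and the number of singular values retained.
    """
    U, s, Vt = np.linalg.svd(S)
    keep = s > rel_threshold * s[0]      # s is sorted in descending order
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    # S = U diag(s) Vt  =>  S^+ = V diag(s_inv) U^T
    return (Vt.T * s_inv) @ U.T, int(keep.sum())

# Toy noise covariance with eigenvalues spanning twelve orders of magnitude.
S_n = np.diag([1.0, 1e-3, 1e-12])

thresholds = [1e-2, 1e-6, 1e-15]
ranks = [truncated_pinv(S_n, t)[1] for t in thresholds]
print(ranks)  # [1, 2, 3]
```

Each threshold retains a different number of components from the same matrix, so the fused result depends directly on a value that has to be chosen by hand.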
CDF(2022): Avoids inverting \mathbf{S}_{ni} by instead inverting the total error covariance \mathbf{S}_{i} = \mathbf{S}_{ni} + \mathbf{S}_{si}, which is typically well-conditioned and invertible. This is a major advantage and is the primary reason CDF(2022) is recommended for operational use.
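A small numerical illustration of this point follows. The matrices are invented for the example: `S_n` is built to be rank-deficient and `S_s` is assumed to be positive definite, which is what makes the sum invertible here. Real smoothing-error covariances need not behave this cleanly.

```python
import numpy as np

# Hypothetical noise covariance on 3 levels built from 2 independent
# noise sources, so it has rank 2 and cannot be inverted directly.
G = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
S_n = G @ G.T

# Hypothetical smoothing-error covariance, assumed positive definite.
S_s = 0.5 * np.eye(3)

S_tot = S_n + S_s

rank_n = np.linalg.matrix_rank(S_n)   # rank of the noise covariance
cond_tot = np.linalg.cond(S_tot)      # 2-norm condition number of the sum

# The total covariance can be used in an ordinary linear solve.
y = np.array([1.0, 2.0, 3.0])
u = np.linalg.solve(S_tot, y)

print(rank_n, cond_tot)
```

In this toy case the noise covariance has rank 2 on a 3-level grid, while the total covariance has eigenvalues 0.5, 1.5 and 2.5 and can be solved against without any thresholding.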
Numerical Errors
Two types of numerical error can be problematic:
- Truncation errors: Arise from discretizing continuous equations and operations
- Roundoff errors: Accumulate in floating-point arithmetic, especially problematic when operating on matrices with large condition numbers
These errors become critical for state vectors with large dynamic range, that is, when state vector components span many orders of magnitude and interact with covariance matrices of similarly disparate scales.
Diagnostic Parameters
To identify datasets prone to numerical problems, monitor these quantities:
- Eigenvalue dynamics of \mathbf{S}_{ni}: Ratio of largest to smallest eigenvalue, which for a symmetric positive semi-definite matrix is its condition number. Large ratios indicate ill-conditioning.
- Eigenvalue dynamics of \mathbf{S}_{i}: Similarly indicative of conditioning; typically better than \mathbf{S}_{ni}.
- State vector dynamics: Ratio of largest to smallest magnitude component. Indicates potential scaling issues.
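The three diagnostics above could be computed along the following lines. The function names and the toy inputs are hypothetical, the covariance matrices are assumed to be symmetric, and zero state components are simply excluded from the state ratio, which is a choice made for this sketch.

```python
import numpy as np

def eigenvalue_ratio(S):
    """Ratio of largest to smallest eigenvalue magnitude of a symmetric matrix."""
    w = np.abs(np.linalg.eigvalsh(S))
    return np.inf if w.min() == 0.0 else w.max() / w.min()

def state_ratio(x):
    """Ratio of largest to smallest nonzero magnitude in the state vector."""
    ax = np.abs(np.asarray(x, dtype=float))
    ax = ax[ax > 0]
    return ax.max() / ax.min()

def fusion_diagnostics(x, S_n, S_tot):
    """Collect the three dynamic-range diagnostics in one dictionary."""
    return {
        "eig_ratio_S_n": eigenvalue_ratio(S_n),
        "eig_ratio_S_tot": eigenvalue_ratio(S_tot),
        "state_ratio": state_ratio(x),
    }

# Toy inputs with deliberately disparate scales.
x = np.array([1e-6, 1e-3, 1.0])
S_n = np.diag([1e-12, 1e-6, 1.0])
S_tot = S_n + 1e-4 * np.eye(3)

diag = fusion_diagnostics(x, S_n, S_tot)
for name, value in diag.items():
    print(f"{name}: {value:.3g}")
```

What counts as a "large" ratio is not fixed by the text; a practical starting point is to compare the ratios against the reciprocal of the floating-point precision in use (about 1e16 for double precision) and flag datasets that come within a few orders of magnitude of it.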
Recommendation: For problematic datasets, consider:
- Normalizing state vectors to similar scales before fusion
- Using robust matrix factorization methods (QR, Cholesky) in place of direct inversion
- Iterative refinement of solutions in ill-conditioned cases
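As one possible way to combine the first two recommendations, the sketch below rescales a covariance matrix to unit diagonal and then solves the linear system through a Cholesky factorization instead of forming an explicit inverse. The function name `scaled_cholesky_solve` and the test matrix are hypothetical, and the approach assumes the matrix is symmetric positive definite, which suits \mathbf{S}_{i} better than a singular \mathbf{S}_{ni}.

```python
import numpy as np

def scaled_cholesky_solve(S, y):
    """Solve S u = y for symmetric positive definite S.

    S is first rescaled to unit diagonal (S = D C D with D = diag(d)),
    then C is factorized as L L^T and the system is solved in two steps.
    """
    d = np.sqrt(np.diag(S))
    C = S / np.outer(d, d)            # correlation-like matrix, unit diagonal
    L = np.linalg.cholesky(C)         # raises LinAlgError if C is not SPD
    # np.linalg.solve is a general solver; scipy.linalg.cho_solve would
    # exploit the triangular structure, but NumPy keeps the sketch minimal.
    z = np.linalg.solve(L, y / d)
    w = np.linalg.solve(L.T, z)
    return w / d

# Toy covariance whose standard deviations span six orders of magnitude.
d_true = np.array([1e-6, 1e-3, 1.0])
C_true = np.array([[1.0, 0.5, 0.2],
                   [0.5, 1.0, 0.3],
                   [0.2, 0.3, 1.0]])
S = C_true * np.outer(d_true, d_true)

u_true = np.array([1.0, 2.0, 3.0])
y = S @ u_true
u = scaled_cholesky_solve(S, y)

print(np.linalg.cond(S), np.linalg.cond(C_true))
print(u)
```

The rescaling removes the part of the ill-conditioning that comes purely from mismatched units or magnitudes; any ill-conditioning that remains in the rescaled matrix reflects genuine near-dependence between components and is not cured by this step.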
Detailed investigation of numerical stability and best practices for different atmospheric regimes remains an important area for future research.
