imagery of complex urban scenes. Their network consists of two streams: one for extracting features from optical images and one for learning features from SAR images. The extracted features are then fused via a concatenation layer for a binary prediction of their correspondence. A selection of true positives, false positives, false negatives, and true negatives of SAR-optical image patches from [58] is presented in Figure 9. Similarly,

[Figure 8: (a) synthetic wrapped interferograms as CNN training input, with synthetic decorrelation masks and phase gradients (x, y) as the desired outputs; (b) the trained CNN applied to real wrapped interferograms to estimate the decorrelation mask and phase gradients (x, y); (c) phase unwrapping to obtain the unwrapped phase W, the deformation map Wm = W·λ/(4π), and the deformation score DEF = std(Wm); (d) dissemination of time series and deformation maps on a public website and an e-mail alert to a private list if DEF > 0.001.]

FIGURE 8. The workflow of the volcano deformation (DEF) detection proposed in [168]. The CNN is trained on simulated data and later used to predict phase gradients and a decorrelation mask from the input wrapped interferograms to locate ground deformation caused by volcanoes. (a) The CNN training. (b) The phase gradient detection. (c) The phase unwrapping and score computation. (d) The dissemination.
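
The two-stream fusion described above lends itself to a compact illustration. The following PyTorch sketch is only a minimal example under assumed settings: the 32 x 32 patch size, the layer widths, and the name PseudoSiameseNet are illustrative choices, not the configuration of [58]. Each modality passes through its own stream, the feature maps are concatenated, and a small head produces a binary correspondence logit.

```python
import torch
import torch.nn as nn

class PseudoSiameseNet(nn.Module):
    """Minimal sketch of a two-stream SAR-optical matching network.

    Layer sizes and depths are illustrative assumptions, not the exact
    configuration used in [58].
    """

    def __init__(self):
        super().__init__()
        # Stream for optical patches (3-channel input).
        self.optical_stream = self._make_stream(in_channels=3)
        # Stream for SAR patches (1-channel input); weights are NOT shared.
        self.sar_stream = self._make_stream(in_channels=1)
        # Fusion by concatenation, followed by a binary classification head.
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(2 * 64 * 8 * 8, 256),
            nn.ReLU(inplace=True),
            nn.Linear(256, 1),  # logit: corresponding patch pair or not
        )

    @staticmethod
    def _make_stream(in_channels):
        return nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )

    def forward(self, optical_patch, sar_patch):
        f_opt = self.optical_stream(optical_patch)
        f_sar = self.sar_stream(sar_patch)
        fused = torch.cat([f_opt, f_sar], dim=1)  # concatenation fusion
        return self.head(fused)

# Example: 32 x 32 patches; after two 2 x 2 poolings the feature maps are 8 x 8.
model = PseudoSiameseNet()
logit = model(torch.randn(4, 3, 32, 32), torch.randn(4, 1, 32, 32))
loss = nn.BCEWithLogitsLoss()(logit.squeeze(1), torch.ones(4))
```

Because the two streams do not share weights, each can adapt to the very different statistics of SAR and optical imagery; the concatenation layer is the only point of interaction before the classification head.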
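
The score computation in panels (c) and (d) of Figure 8 reduces to two lines of arithmetic. The NumPy sketch below assumes an already unwrapped phase array and a C-band wavelength of roughly 0.0555 m; the function name deformation_score and the synthetic input are illustrative, while the relations Wm = W·λ/(4π), DEF = std(Wm), and the 0.001 alert threshold are taken from the figure.

```python
import numpy as np

def deformation_score(unwrapped_phase, wavelength):
    """Convert unwrapped interferometric phase to a deformation map and score.

    Follows the relations shown in Figure 8: Wm = W * lambda / (4 * pi) and
    DEF = std(Wm). Names and inputs are illustrative assumptions.
    """
    deformation_map = unwrapped_phase * wavelength / (4.0 * np.pi)  # Wm, in metres
    return deformation_map, float(np.std(deformation_map))          # DEF

# Example with synthetic data; 0.0555 m is an assumed C-band radar wavelength.
W = np.random.default_rng(0).normal(scale=0.5, size=(256, 256))  # unwrapped phase (rad)
Wm, DEF = deformation_score(W, wavelength=0.0555)
if DEF > 0.001:  # alert threshold shown in Figure 8(d)
    print(f"DEF = {DEF:.4f}, trigger e-mail alert")
```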