Evaluating atmospheric instability from high spectral resolution IR satellite observations

P. Antonelli, A. Manzato, S. Puca, F. Zauli

April 11, 2011

Ref.: PA/IIS/FR/2010/01

Abstract

This document describes the activities performed to derive atmospheric instability in clear sky from high spectral resolution IR data observed by the IASI instrument on board METOP-A. High spectral resolution observations have been shown to carry information on the vertical structure of the atmosphere (temperature and water vapor concentration) at a higher vertical resolution than any other observing system operated on board satellites. This information was expected to allow the characterization of the pre-convective environment in clear sky conditions. The goal of the study was the development of an automated system for the identification of potentially unstable air masses through the combination of IASI level 3 products (instability indices derived from atmospheric temperature and water vapor vertical profiles obtained through a physical retrieval) and linear combinations of IASI level 1 radiances. The results were compared with those obtained using instability indices derived from high vertical resolution rawinsondes instead of IASI data and products, and demonstrated the feasibility of an automatic system for the near-casting of convective events in clear sky conditions. The outcomes of this study are expected to lead, in the future, to the implementation of operational near-casting applications.

Contents

1 Technical Reports 6 and 7: Prediction of convective event: intercomparison of results obtained from IASI data with those obtained from rawinsondes, over Po Valley
  1.1 Introduction
    1.1.1 Description of the Experiment
    1.1.2 Event climatology
  1.2 Nowcasting convective events from rawinsondes
    1.2.1 Full data set for Rawinsondes
    1.2.2 Predicting events from rawinsonde derived indices
      1.2.2.1 Empirical posterior probability for rawinsonde indices
      1.2.2.2 Forward selection algorithm
    1.2.3 Results
      1.2.3.1 Training
      1.2.3.2 Testing
  1.3 Nowcasting convective events from IASI
    1.3.1 Full data set for IASI
      1.3.1.1 Predictors derived from IASI retrievals
      1.3.1.2 Predictors derived from IASI principal components
    1.3.2 Predicting events from IASI derived indices and PCS
      1.3.2.1 Empirical posterior probability for IASI indices
      1.3.2.2 Forward selection algorithm
    1.3.3 Results
      1.3.3.1 Training
      1.3.3.2 Testing
    1.3.4 Discussion of the results
      1.3.4.1 Limited size of the IASI Full data set
      1.3.4.2 Increasing the Full IASI data set
  1.4 Conclusions
2 Technical Report 5: validation of Level 3 products derived from vertical rawinsonde and retrieval profiles with occurrence of convection as detected by Lightnings
  2.1 Relationship between Instability Indices and convection occurrence
    2.1.1 Linear correlation between instability binary indices and convection occurrence
    2.1.2 Cross-entropy error between instability indices and convection occurrence
    2.1.3 Skill scores
  2.2 Results
    2.2.1 Udine Campoformido: Linear correlation between instability binary indices and convection occurrence and cross-entropy
    2.2.2 Udine Campoformido: skill scores
    2.2.3 Pratica di Mare: Linear correlation between instability binary indices and convection occurrence and cross-entropy
    2.2.4 Pratica di Mare: skill scores
    2.2.5 Cagliari: Linear correlation between instability binary indices and convection occurrence and cross-entropy
    2.2.6 Cagliari: skill scores
  2.3 Analysis of Results
  2.4 Conclusions
3 Technical Report 4: Comparison of Level 3 products (Instability Indices) derived from satellite observations and rawinsondes
  3.1 Generation of Instability Indices
    3.1.1 Lifted Parcel Theory assumptions
    3.1.2 Lifted Parcel Theory
    3.1.3 Selection of Instability Indices
  3.2 Results
    3.2.1 Udine Campoformido
    3.2.2 Pratica di Mare
    3.2.3 Cagliari
  3.3 Analysis of Results
    3.3.1 Forecast derived indices
    3.3.2 Time dependence of instability indices
  3.4 Conclusions
4 Technical Report 3: Validation of baseline Retrieval with rawinsondes
  4.1 Inversion with UWPHYSRET
    4.1.1 IASI observations used in retrieval
    4.1.2 A-priori Covariance: in-situ observations used for climatology
    4.1.3 Error Covariance
    4.1.4 Forward Model
    4.1.5 Minimization Scheme
    4.1.6 Convergence Test
    4.1.7 Retrieval Errors
  4.2 Validation strategy
    4.2.1 Spectral Validation
    4.2.2 Environmental Validation
      4.2.2.1 In situ observations used for environmental validation
      4.2.2.2 Statistical quantities used to characterize environmental validation
  4.3 Results
    4.3.1 Udine Campoformido
    4.3.2 Pratica di Mare
    4.3.3 Cagliari
  4.4 Analysis of Results
  4.5 Conclusions
5 Technical Report 2: Instability Indices (Level 3 Products) derived from IASI retrievals (Level 2 Products)
  5.1 Introduction
  5.2 Observations
  5.3 Retrievals
  5.4 Instability Indices
  5.5 Conclusions
6 Technical Report 1: Dataset description
  6.1 Introduction
  6.2 IASI data
    6.2.1 Pratica di Mare (lat: 41.65N, lon: 12.43E)
    6.2.2 Udine Campoformido (lat: 46.03N, lon: 13.18E)
    6.2.3 Cagliari (lat: 39.25N, lon: 9.05E)
  6.3 Lightning
    6.3.1 Pratica di Mare
    6.3.2 Udine, Campoformido
    6.3.3 Cagliari
  6.4 Rawinsondes
    6.4.1 Pressure interpolated profiles
    6.4.2 Pratica di Mare
    6.4.3 Udine, Campoformido
    6.4.4 Cagliari
7 Conclusions

Introduction

This document is a collection of seven technical reports (chapters), presented in reverse chronological order so that the reader can go through the final results without first reading all the preparatory parts (which are nevertheless included). Chapter 1 includes the last two technical reports of the project, 6 and 7.
It focuses on the implementation of a prototype prediction system for the near-casting of convective events, defined as the occurrence of more than 10 lightning strikes (with at least 1 second separation for nearby strikes) over the area of interest (Po Valley, Italy) between 11:00 UTC and 17:00 UTC, in the annual period April-October. The effort aims to investigate and compare the capacity of predicting a convective event using IASI data and products on one side, and rawinsonde products on the other. Chapter 2 (technical report 5), preparatory to chapter 1, is dedicated to the validation of the instability indices derived from IASI data, and of those derived from high vertical resolution rawinsondes, against lightning occurrence (convective event observations) in the areas of interest over Pratica di Mare, Cagliari, and Udine, Italy. Chapter 3 (technical report 4) provides a comparison of the instability indices derived from satellite observations and from rawinsondes. Chapter 4 (technical report 3), going backwards, describes the results obtained validating the baseline retrieval (level 2 products) derived from IASI against the available high vertical resolution rawinsondes launched within 50 km and about 100 minutes of the satellite observations. Chapters 5 and 6, which include technical reports 2 and 1 respectively, are dedicated to the description of the software package used to derive instability indices from vertical profiles of atmospheric temperature and water vapor, and to the dataset descriptions. Finally, chapter 7 reproduces the conclusions of the main chapter (chapter 1) in a more general form.

Chapter 1

Technical Reports 6 and 7: Prediction of convective event: intercomparison of results obtained from IASI data with those obtained from rawinsondes, over Po Valley.

Document: Technical Reports 6 and 7
Written by: Paolo Antonelli, A. Manzato
Date: 09 March 2011
Reference: PA/IIS/TR06/2011/01

1.1 Introduction

This effort aims to investigate and compare the capacity of predicting a convective event over the Po Valley using: 1) rawinsonde products, 2) IASI data and products. In the study a convective event was defined as the occurrence of more than 10 lightning strikes (with at least 1 second separation for nearby strikes) over the area of interest, between 11:00 UTC and 17:00 UTC.

In the first approach, 11:00 UTC rawinsondes from the sites of Milano Linate and Udine Campoformido were used to generate sets of 50 instability indices per site. Among these indices, 8 were selected as best predictors by a forward selection algorithm and fed to an ANN, which produced two contingency tables, for the Training and Test sets, associated with Peirce Skill Scores (PSS) of 0.67 and 0.68 respectively.

In the second approach, 30 instability indices (only the ones not depending on winds) were derived from IASI level 2 products generated by a physical retrieval approach (UWPHYSRET) over 2 areas centered on Milano and Udine. Twenty linear combinations of IASI channels (Principal Component Scores, PCS) were added to the 30 indices as potential predictors. The three best predictors were selected and fed to an ANN, which produced two contingency tables, for the Training and Test sets, associated with PSS values of 0.67 and 0.33 respectively. The poor generalization of the prediction of convective events from IASI data and products was found to depend mostly on the limited size of the IASI database (available retrievals in clear sky conditions), which is a factor of 10 smaller than the rawinsonde database (both for training and testing). However, a general tendency of the retrievals to overestimate low level water vapor, which led to an overestimation of the atmospheric instability, was found and should be further investigated. Finally, by focusing on a single area of interest we were able to increase the size of the IASI database by a factor of two, and the PSS on the test set reached 0.57, indicating that nowcasting of convection from IASI data over individual (smaller) areas is feasible and promising. Besides the final scores, the significance of the presented material lies in the correlation found between some of the PCS and the occurrence of convection, and in the validation of the IASI level 2 and level 3 products.

This document describes the statistical links found between the instability predictors derived either from 11:00 UTC rawinsondes (instability indices [15]) or from morning overpass IASI data (instability indices and principal component scores), and the occurrence of convective events, defined by the observation of 10 or more lightning strikes between 11:00 UTC and 17:00 UTC over the Po Valley, for the seasonal period from the beginning of April to the end of October. The document is divided into four sections, dedicated respectively to: 1) introduction to the experiment; 2) nowcasting convective events from rawinsondes; 3) nowcasting convective events from IASI data; 4) conclusions on the achieved results.

1.1.1 Description of the Experiment

The goal of the experiment is to investigate and compare the capacity of predicting a convective event over a designated area using:
• rawinsonde products;
• IASI data and products.

In the study a convective event is defined as the occurrence of more than 10 lightning strikes (with at least 1 second separation if the LAT-LON coordinates are very close) over the region indicated by the yellow area in figure 1.1, between 11:00 UTC and 17:00 UTC. The rawinsonde sites of Milano Linate (45.45°N, 9.27°E) and Udine Campoformido (46.02°N, 13.16°E) are indicated by the green markers, and are located in an ideal position to characterize the boundary conditions for the area of interest.
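The event definition above can be made concrete with a short sketch. The Python function below is illustrative only and is not the processing used in the study: the strike record layout, the `max_sep_deg` tolerance used to decide that two strikes are "very close", the inclusive time window, and the rectangular bounding box standing in for the yellow area of figure 1.1 are all assumptions. The default `min_strikes=11` encodes "more than 10 strikes" and is left as a parameter.

```python
from datetime import datetime, timedelta

def is_convective_event(strikes, box, day, min_strikes=11,
                        max_sep_deg=0.01, min_dt=timedelta(seconds=1)):
    """Return True if the day counts as a convective event.

    strikes: iterable of (utc_datetime, lat, lon) tuples (hypothetical layout).
    box: (lat_min, lat_max, lon_min, lon_max), a stand-in for the event area.
    max_sep_deg, min_dt: assumed criteria for merging nearby strikes.
    """
    start = day.replace(hour=11, minute=0, second=0, microsecond=0)
    end = day.replace(hour=17, minute=0, second=0, microsecond=0)
    lat_min, lat_max, lon_min, lon_max = box
    kept = []
    for t, lat, lon in sorted(strikes):
        # Keep only strikes inside the 11:00-17:00 UTC window (assumed inclusive).
        if not (start <= t <= end):
            continue
        # Keep only strikes inside the area of interest.
        if not (lat_min <= lat <= lat_max and lon_min <= lon <= lon_max):
            continue
        # Skip a strike that is very close in space and less than min_dt
        # after a strike that was already counted.
        duplicate = any(
            abs(lat - klat) <= max_sep_deg and abs(lon - klon) <= max_sep_deg
            and (t - kt) < min_dt
            for kt, klat, klon in kept
        )
        if not duplicate:
            kept.append((t, lat, lon))
    return len(kept) >= min_strikes
```

The function returns the binary flag (event yes/no) that is used as the quantity to be predicted in the rest of the chapter.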
The capacity of predicting was measured by calculating the Peirce skill score (PSS [8, 12, 13]) of a binary classifier obtained from a set of continuous predictors, using thresholds to dichotomize the predictor values into event occurrence and non-occurrence classes. The strategy used to achieve the goal is summarized as follows:

• define the occurrence of a convective event by setting a threshold (of 10 strikes) to convert the discrete distribution of lightning strikes into a binary output, event yes/no (1/0);

• build a Full Dataset with the occurrence of the convective event (yes/no) and the values of all available predictors;
  – divide the Full Dataset into 2 subsets: 1) Total Set, used to build the classifier; 2) Test Set, used for the final evaluation of the prediction capacity of the classifier;
  – divide, in 12 different ways, the Total Set into 2 subsets: 1) Training Set (75%); 2) Validation Set (25%); both are used to select the optimal predictors with the Repeated Holdout Technique [18];

• define, for each predictor, an empirical posterior probability, i.e. a mathematical relationship which associates the probability of having an event with the continuous values of the predictor (figures 1.4, 1.5), and use it as pre-processing;

• implement a forward selection algorithm (based on Artificial Neural Networks, namely a single layer, feedforward network trained with backpropagation [9, 14]) to choose an optimal subset of predictors. The algorithm first chooses the one predictor that gives the best classification of the event occurrences starting from its empirical probability distribution. Then it selects the predictor which gives the best fit when used together with the first one. New predictors are added until the predictive skill of the system stops increasing. The number of input predictors was chosen taking into consideration the mean skill of the 12 ANNs built with the different instances of the Training Sets. The prediction skill of the ANN was measured by the mean cross-entropy error (CEE):

  $$CEE = -\frac{1}{N} \sum_{n=1}^{N} \left[ t_n \ln(y_n) + (1 - t_n) \ln(1 - y_n) \right] \tag{1.1}$$

  where $y_n$ is the output of the ANN and $t_n$ is the boolean for the convective event (1|yes, 0|no), calculated over the 12 instances of the Validation Sets. The cross-entropy can be used as an error measure when a network's outputs can be thought of as representing independent hypotheses (e.g. each node stands for a different concept), and the node activations can be understood as representing the probability (or confidence) that each hypothesis might be true. In that case the output vector represents a probability distribution, and the error measure, cross-entropy, indicates the distance between what the network believes this distribution should be and what the teacher says it should be [16]. CEE = 0 is a perfect score and a CEE of order 1 is a poor score;

• once the optimal subset of predictors was identified, the final ANN architecture was chosen among different candidates (different numbers of neurons in the hidden layer) as the one with the lowest combined CEE on the Total set (Training + Validation) without overfitting it, that is, with similar performance also on the independent Test set;

• the learning and the generalization of the knowledge during the supervised training of the ANN were evaluated quantitatively using the Receiver Operating Characteristic (ROC) [19, 12]. A ROC curve summarizes the performance of a two-class classifier across the range of possible thresholds. An ideal classifier hugs the left side and the top side of the graph, and the area under the curve is 1.0. A random classifier should achieve an area of approximately 0.5 and lies along the 45 degree bisector (a classifier with an area less than 0.5 can be improved simply by flipping the class assignment). The ROC curve is recommended for comparing classifiers, as it does not merely summarize performance at a single arbitrarily selected decision threshold, but across all possible decision thresholds [http://www.statsoft.com/textbook/statistics-glossary/r/button/r/].

Once the output of the ANN was dichotomized using the event prior probability as threshold, the contingency table 1.1 was calculated, and the different statistical scores (table 1.2) were determined.

                   Event (Y)   Event (N)
Prediction: YES    a           b
Prediction: NO     c           d

Table 1.1: Example of contingency table

Score     Expression
POD       a / (a+c)
POFD      b / (b+d)
FAR       b / (a+b)
HIT       (a+d) / (a+b+c+d)
BIAS      (a+b) / (a+c)
O         ad / (bc)
HSS       2(ad-bc) / [(a+c)(c+d) + (a+b)(b+d)]
PSS       (ad-bc) / [(a+c)(b+d)]
E(PSS)    sqrt{ [N^2 - 4(a+b)(c+d) KSS^2] / [4N (a+b)(c+d)] }

Table 1.2: Scores derived from the contingency table. N = a+b+c+d; O is the odds ratio; KSS (Kuipers skill score) is another name for the PSS.

1.1.2 Event climatology

The database of convective events for the period 2004-2010 was developed by Osservatorio Meteorologico Regionale (OSMER) of Agenzia Regionale per la Protezione dell'Ambiente del Friuli Venezia Giulia (ARPA-FVG), using lightning data previously purchased from the CESI/SIRF company. The climatology of the observed number of lightning strikes in 6 h (without the 0 cases), for the period under consideration and for the area of interest, is shown by the solid line in figure 1.2. The distribution of the number of lightning strikes was also fitted with the Pareto distribution:

$$N(x) = 50 \cdot t \cdot x_{\min}^{t} \cdot x^{-(t+1)} \tag{1.2}$$

with $x_{\min} = 50$ and $t = 0.33$ (dashed line in figure 1.2). A threshold of 10 strikes was chosen to define a convective event. The threshold value was selected subjectively, considering 10 strikes enough to guarantee a significant (with respect to the area size) convective event.
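As a quick numerical illustration of equation 1.2, the fitted curve can be evaluated directly. The sketch below is only an illustration: the function name is hypothetical, and the defaults are the values quoted in the text (x_min = 50, t = 0.33, leading factor 50).

```python
def pareto_fit(x, x_min=50.0, t=0.33, scale=50.0):
    """Evaluate N(x) = scale * t * x_min**t * x**-(t + 1) (equation 1.2)."""
    if x <= 0:
        raise ValueError("x must be positive")
    return scale * t * x_min**t * x ** (-(t + 1.0))

# Values of the fitted curve at a few strike counts.
for x in (10, 50, 100, 1000):
    print(x, round(pareto_fit(x), 4))
```

Because the exponent is -(t + 1) = -1.33, doubling the number of strikes lowers the fitted value by a factor of 2^1.33, which is what produces the long tail of rare, very active days.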
600 cases in about 1500 days (40.0%) showed at least 10 lightning strikes in the 6 h period between 11:00 and 17:00 UTC. The lightning distribution was found to have median = 1, mean = 211.63, and standard deviation = 647.19. No lightning activity was recorded in 696 cases. In 898 cases the lightning activity amounted to less than 10 strikes over the whole area. The number of cases with one strike only was 802, and the maximum activity was observed on 30 Aug 2007, when 8174 strikes were counted. By setting the threshold of at least 10 lightning strikes to define a convective event, the probability of an event over the whole dataset was estimated to be around 40%.

Figure 1.1: Event Area

1.2 Nowcasting convective events from rawinsondes

Rawinsonde data for this study collected between 2004 and 2009 were provided by Centro Nazionale per la Meteorologia e la Climatologia Aeronautica (CNMCA), while rawinsonde data for 2010 were retrieved from the University of Wyoming archive. The locations of the rawinsonde launching sites are shown by the green markers in figure 1.1. It is worth emphasizing that only rawinsondes launched at 11:00 UTC were taken into consideration in this study.

1.2.1 Full data set for Rawinsondes

The 1433 coincident rawinsondes available for Udine and Milano were used to generate 50 instability indices (table 1.3) using Sound_Analys.py, the software package developed at OSMER [15, 11]. The rawinsonde Full data set was built using the instability indices calculated over the two individual sites (2 subsets), their differences (Milano - Udine, third subset), plus 4 combinations of some indices for each of the 3 subsets, for a total of 50 · 3 + 12 = 162 variables. Including also the Julian date (JJJ) as a possible predictor, the Full data set therefore contained 1433 cases with 163 candidate predictors and 1 boolean (event YES|NO) to be predicted. However, candidate predictors with a high number of missing values, such as the Level of Free Convection (LFC) and the Equilibrium Level (EL), which are defined only for potentially unstable profiles, were not considered, effectively reducing the number of candidate predictors to 157.

Figure 1.2: Probability distribution of convective events

1.2.2 Predicting events from rawinsonde derived indices

1.2.2.1 Empirical posterior probability for rawinsonde indices

According to the strategy described in sec. 1.1.1, once the Full data set was assembled, the Empirical Posterior Probability (EPP) functions were determined by fitting ad-hoc curves to the distribution of the event likelihood probabilities. Figures 1.3, 1.4, and 1.5 show examples of the EPP, respectively for the Julian day and for two of the predictors (CAPE and DT500). The complete set of EPP plots for the rawinsonde derived predictors can be found in ANNEX1.

1.2.2.2 Forward selection algorithm

The EPP of the indices generated over Udine and Milano, and those of their differences, were then used to determine the optimal subset of predictors to forecast the occurrence of a convective event, again following the procedure described in section 1.1.1.
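The selection procedure of section 1.1.1 can be sketched as a greedy loop scored by the cross-entropy error of equation 1.1. This is a minimal illustration, not the authors' implementation: instead of training an ANN on 12 Training/Validation splits, the hypothetical `combine` function below merges predictors that are already expressed as probabilities (as after the EPP pre-processing) by summing their log-odds, and the toy data are invented for the example.

```python
import math

def cross_entropy_error(targets, outputs, eps=1e-12):
    """Mean cross-entropy error (equation 1.1) for binary targets."""
    total = 0.0
    for t, y in zip(targets, outputs):
        y = min(max(y, eps), 1.0 - eps)  # keep log() finite
        total += t * math.log(y) + (1 - t) * math.log(1.0 - y)
    return -total / len(targets)

def combine(columns):
    """Stand-in for the ANN (assumption): sum the log-odds of the inputs."""
    outputs = []
    for probs in zip(*columns):
        logit = sum(math.log(p / (1.0 - p)) for p in probs)
        outputs.append(1.0 / (1.0 + math.exp(-logit)))
    return outputs

def forward_selection(predictors, targets):
    """Add one predictor at a time until the error stops decreasing."""
    selected, best_error = [], float("inf")
    remaining = list(predictors)
    while remaining:
        # Error obtained by adding each remaining candidate to the current set.
        errors = {
            name: cross_entropy_error(
                targets, combine([predictors[n] for n in selected + [name]]))
            for name in remaining
        }
        best_name = min(errors, key=errors.get)
        if errors[best_name] >= best_error:
            break  # no further improvement
        selected.append(best_name)
        best_error = errors[best_name]
        remaining.remove(best_name)
    return selected, best_error

# Toy data (invented): event flags and three predictors given as probabilities.
targets = [1, 0, 1, 0]
predictors = {
    "A": [0.9, 0.1, 0.8, 0.2],
    "B": [0.6, 0.4, 0.7, 0.3],
    "C": [0.5, 0.5, 0.5, 0.5],
}
selected, error = forward_selection(predictors, targets)
print(selected, round(error, 3))
```

With these toy data the loop keeps A and B and rejects C, which carries no information and therefore cannot lower the error. In the study the error of each candidate subset is instead the mean validation CEE of the ANNs trained on the 12 holdout instances.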
Figure 1.6 shows the best 8 predictors found among the 157 rawinsonde derived indices according to the mean CEE values of the Validation Errors (VE), Training Errors (TE), and Total Errors (Total E = 0.75 TE + 0.25 VE), which are also reported in table 1.4, where the letters u and m at the end of the variable names stand for Udine and Milano respectively.

Index [unit]            Description
P(JJJ) []               Julian day
BOY []                  Boyden index
BRI []                  Bulk Richardson number
BS850 [m s^-1]          Bulk Shear 850 hPa - 100 m
CAP [°C]                Maximum cap (as Θes difference)
CAPE [J/kg]             Convective available potential energy
CIN [J/kg]              Convective inhibition
DT500 [°C]              Difference of temperature at 500 hPa
DTC [°C]                Core difference of temperature
DownPotm [K]            Downdraft potential
EHI []                  Energy-helicity index
HD [cm]                 Hail Diameter (derived from UpDr)
HLJD [m]                High-levels (6-12 km) jet depth
HLWu [m s^-1]           U component of high-levels (6-12 km) wind
HLWv [m s^-1]           V component of high-levels (6-12 km) wind
HRH [%]                 High-levels (500-300 hPa) relative humidity
HEL [J kg^-1]           Helicity
KI [°C]                 K index
LCL [m]                 Lifting condensation level
LFC [m]                 Level of free convection height
LI [°C]                 Lifted index
LLJD [m]                Low-levels (lowest 6 km) jet depth
LLWu [m s^-1]           U component of low-level wind (0.5 km)
LLWv [m s^-1]           V component of low-level wind (0.5 km)
LRH [%]                 Mean relative humidity in the first 250 hPa
SWEAT []                Severe weather threat
MEL [m]                 Melting level (parcel at 0°C)
MLWu [m s^-1]           U component of midlevel wind (6 km)
MLWv [m s^-1]           V component of midlevel wind (6 km)
MRH [%]                 Mean relative humidity in the first 500 hPa
MaxBuo [K]              Maximum buoyancy
Mix [g/kg]              Most unstable parcel (MUP) mixing ratio
PBL [m]                 Planetary boundary layer estimated height
PWC [mm]                Precipitable water content of cloud
PWE [mm]                Precipitable water content of environment
Rel_Hel [m^2 s^-2]      Relative helicity
SWISS []                Stability and wind shear index for storms in Switzerland
Shear [s^-1]            Wind shear in the lowest 12 km
Shear3 [s^-1]           Wind shear in the lowest 3 km
ShowI [°C]              Showalter index
Tbase [°C]              Cloud-base (LCL) temperature
Thetae [K]              Most Unstable Parcel Θe
Trop [m]                Tropopause height
UpDr [m/s]              "Core updraft" (parcel at -15°C)
VFlux [m^-2 s^-1 kg]    Mean water vapor horizontal flux
VV [m s^-1]             Radiosonde ascensional vertical velocity
VVstd [m s^-1]          Std dev of radiosonde vertical velocity
WBZ [m]                 Environmental wet bulb zero height
b_PBL [cm/s^2]          Mean buoyancy acceleration of the first 250 hPa
h_MUP [m]               MUP height

Table 1.3: Instability Indices

Figure 1.3: Empirical relationship between Julian day and lightning occurrence. The peak of the activity is found in July.

Figure 1.4: Empirical relationship between CAPE and occurrence of at least 10 lightning strikes. In the left panel CAPE was derived from Udine Campoformido rawinsondes, in the right panel from Milano Linate rawinsondes.

Figure 1.5: Empirical relationship between DT500 and occurrence of at least 10 lightning strikes. In the left panel DT500 was derived from Udine Campoformido rawinsondes, in the right panel from Milano Linate rawinsondes.

Variable   <VE>    <TE>    <Total E>
DTCu       0.488   0.484   0.485
SWISSm     0.433   0.428   0.429
WBZm       0.402   0.402   0.402
KIu        0.375   0.371   0.372
Mixu       0.363   0.356   0.358
PWCm       0.351   0.338   0.342
ShowIm     0.354   0.330   0.336
KIm        0.347   0.313   0.321

Table 1.4: Forward selection algorithm: results for the best rawinsonde derived predictors.

1.2.3 Results

During the input selection phase (forward selection algorithm) only the Total set, which included 949 cases, was used to train the different ANN candidates. To select the best architecture (number of hidden neurons) for the prediction system (ANN), the consistency between the results obtained on the Total set and on the Test set (350 cases) was also taken into account. The chosen architecture has 8 inputs, 2 neurons in the hidden layer, and 1 output.
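The scores listed in Table 1.2 follow directly from the four entries of a contingency table. As a hedged sketch (function and variable names are illustrative), the snippet below evaluates them for the Total-set table reported in section 1.2.3.1 (a = 316, b = 95, c = 63, d = 475). Two points are assumptions rather than statements of the source: the square root in the PSS standard error, taken from the standard expression, and TS, taken to be the threat score a/(a+b+c).

```python
import math

def contingency_scores(a, b, c, d):
    """Scores from a 2x2 table: a hits, b false alarms, c misses, d correct negatives."""
    n = a + b + c + d
    pss = (a * d - b * c) / ((a + c) * (b + d))
    return {
        "POD": a / (a + c),
        "POFD": b / (b + d),
        "FAR": b / (a + b),
        "HIT": (a + d) / n,
        "BIAS": (a + b) / (a + c),
        "TS": a / (a + b + c),  # assumed definition (threat score)
        "HSS": 2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d)),
        "PSS": pss,
        # Standard error of the PSS; the square root is an assumption.
        "S(PSS)": math.sqrt((n**2 - 4 * (a + b) * (c + d) * pss**2)
                            / (4 * n * (a + b) * (c + d))),
        "ODDS": (a * d) / (b * c),
    }

scores = contingency_scores(a=316, b=95, c=63, d=475)
for name, value in scores.items():
    print(f"{name:7s} {value:6.2f}")
```

Note that the PSS equals POD - POFD, which is why a classifier that raises the detection rate only by raising the false detection rate by the same amount gains no skill.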
Figure 1.6: The Training-Validation (TV) diagram of the CEEs computed over the 12 bootstraps (instances of training/validation sets) for the first 8 "best ANN-inputs" chosen by the classification forward selection algorithm. The sets of 12 TV points of each variable are represented alternately by filled circles and triangles, while the unfilled squares, connected by a dashed line, show the mean errors over the 12-point bootstraps.

1.2.3.1 Training

Applying the ANN to the Total set led to a Total CEE of 0.335, while applying the probability threshold (0.40) to the continuous ANN output led to the following contingency table:

TOTAL              Event (Y)   Event (N)
Prediction: YES    316         95
Prediction: NO     63          475

The analysis of the contingency table:

TOTAL   POD    HIT    FAR    POFD   BIAS   TS     HSS    PSS    S(PSS)   ODDS
Score   0.83   0.83   0.23   0.17   1.08   0.67   0.66   0.67   0.02     25.08

1.2.3.2 Testing

Applying the ANN to the Test set led to a Test CEE of 0.375, while the entries found in the contingency table were:

TEST               Event (Y)   Event (N)
Prediction: YES    114         33
Prediction: NO     23          180

The analysis of the contingency table:

TEST    POD    HIT    FAR    POFD   BIAS   TS     HSS    PSS    S(PSS)   ODDS
Score   0.83   0.84   0.22   0.15   1.07   0.67   0.67   0.68   0.04     27.03

These scores are even slightly better than those obtained applying the ANN to the Total Set, and indicate a good generalization capacity of the network. The ROC curves for this ANN are shown in figure 1.7.

1.3 Nowcasting convective events from IASI

The procedure described in sec. 1.1.1 and applied to the rawinsondes in sec. 1.2 was repeated using predictors derived from IASI data.

1.3.1 Full data set for IASI

IASI coverage started in 2007 and only 154 coincident retrievals/observations were found for Udine and Milano. The IASI Full data set was built using predictors calculated from IASI retrievals [15, 11, 6] and predictors derived from IASI observed radiances.
The full set was built on 154 cases defined by 160 predictors (CAPu and D_CAP were not used because of the large fraction of missing values) and 1 boolean (event YES|NO) to be predicted.

Figure 1.7: ROC obtained by selected ANN for Total and Test rawinsonde data

1.3.1.1 Predictors derived from IASI retrievals

Instability indices (the 30 which do not depend on wind direction and/or intensity) were generated from IASI level 2 products [6]. The retrievals were obtained by inverting observations collected between April and October, from 2007 to 2010, over the blue areas in figure 1.8, using UWPHYSRET [5, 3] with a local climatology (which included rawinsondes launched from Milano and from Udine at 05:00 and 11:00 UTC) for the characterization of the a-priori covariance. Retrievals were considered successful if the spectral residuals were within the noise level. All successful retrievals were then divided into two categories: those whose profiles did not show evidence of water vapor saturation (218 profiles over Udine, and 294 over Milano) and those which showed evidence of potential saturation (556 profiles over Udine, and 753 over Milano). The first category was considered more reliable; however, the saturated profiles had to be included because of the very limited number of favorable cases. In fact, even with the potentially saturated profiles, after averaging all the cases observed in a single overpass (obtaining 283 cases over Udine, and 332 over Milano), and finding the intersection (coincident profiles) of the two sets of Milano and Udine, only 154 cases were left for the whole period 2007-2010. For each available retrieval the mean temperature and the mean water vapor mixing ratio, within the first 200 hPa of the atmospheric profile, were compared to those derived from the rawinsondes.
Results showed that the mean retrieval temperature over Milano (figure 1.9) and Udine (figure 1.11) is, as expected, generally 1-2 K colder than the mean rawinsonde temperature, because the retrieval is obtained about 90 minutes before the rawinsonde. Red diamonds, associated with potentially saturated profiles, show more outliers and slightly worse statistics than the blue diamonds associated with non-saturated profiles. Results also showed that the water vapor mixing ratio retrieved from IASI is generally overestimated with respect to the values provided by the rawinsondes (figures 1.10 and 1.12). This is expected to lead to an overestimation of the instability because of the overestimation of Θe. Also for water vapor more outliers were found in the potentially unstable profiles.

An additional sanity check on the data used to build part of the IASI Full data set was performed by calculating the linear correlation and the bias between instability indices derived from the retrievals and those derived from the rawinsondes. Results are shown in figures 1.13 and 1.14. Consistent with what is described in [4], the indices which are not dependent on Lifted Parcel Theory showed higher correlation. Also, in general, correlations were found to be higher over Milano than over Udine, indicating better performance of the retrieval system and/or lower temporal and spatial variability of the atmospheric conditions over Milano.

Figure 1.8: IASI retrieval areas

Figure 1.9: Comparison of mean atmospheric temperature in the lowest 200 hPa between retrievals and rawinsondes over Milano. Red and blue diamonds represent respectively mean values for potentially saturated and non-saturated profiles.

Figure 1.10: Comparison of mean atmospheric water vapor mixing ratio in the lowest 200 hPa between retrievals and rawinsondes over Milano.
Red and blue diamonds represent respectively mean values for potentially saturated and non-saturated profiles.

Figure 1.11: Comparison of mean atmospheric temperature in the lowest 200 hPa between retrievals and rawinsondes over Udine. Red and blue diamonds represent respectively mean values for potentially saturated and non-saturated profiles.

Figure 1.12: Comparison of mean atmospheric water vapor mixing ratio in the lowest 200 hPa between retrievals and rawinsondes over Udine. Red and blue diamonds represent respectively mean values for potentially saturated and non-saturated profiles.

Figure 1.13: Correlation of Instability Indices derived from the retrievals and the rawinsondes for the 150 coincident cases (Milano-Udine). Black diamonds show the correlation obtained over Milano, while magenta circles show the correlation found over Udine.

Figure 1.14: Bias (normalized by the standard deviation) of Instability Indices derived from the retrievals and the rawinsondes for the 150 coincident cases (Milano-Udine). Black diamonds show the bias obtained over Milano, while magenta circles show the bias found over Udine.

1.3.1.2 Predictors derived from IASI principal components

Besides the instability indices derived from IASI level 2 products (retrievals), 20 potential predictors were generated using Principal Component Analysis. The procedure to generate the Principal Component Scores (PCS) is fully described in [2, 1, 7]. The set of 20 PCS predictors was created using 10 PCS for the IASI Long Wave (LW) band, 9 for the Mid Wave (MW) band, and 1 for the Short Wave (SW) band. The PCS were associated with Principal Components number 1, 2, 4, 5, 6, 7, 8, 9, 10, 12 for the LW; 1, 2, 4, 6, 7, 8, 9, 10, 17 for the MW; and 24 for the SW. Selection was based on the spectral structure of the Principal Components, looking for those carrying more information on weak water vapor lines (especially in the LW).
A more systematic analysis of individual components is part of the future activity. Some of the PCS were found to have an empirical posterior probability (EPP) distribution (sec. 1.3.2.1) which could fit the lightning observations remarkably well. This was the case for the PCS associated with the 10th LW PC (figure 1.17), which clearly appeared to be a good candidate predictor for the nowcasting of convective events. In addition to the 20 PCS, 12 non-linear combinations of them (such as $\sqrt{PCS_x^2 + PCS_y^2}$) were also included in the list of candidate predictors.

1.3.2 Predicting events from IASI derived indices and PCS

1.3.2.1 Empirical posterior probability for IASI indices

According to the strategy described in sec. 1.1.1 and 1.2.2.1, once the Full data set was assembled, the EPP functions were determined by fitting ad-hoc curves to the distribution of the events. Figures 1.15, 1.16, and 1.17 show examples of the EPP, respectively for the Julian Day and for two of the predictors (CAPE and PCS LW 10m). The complete sets of EPP plots for the IASI derived predictors can be found in ANNEX2.

1.3.2.2 Forward selection algorithm

The forward selection algorithm, applied to the indices generated for the two areas around Udine and Milano and to the 20 IASI Principal Component Scores, along with their differences (Milano - Udine), produced the results reported in table 1.5. Figure 1.18 shows the best 6 predictors found among the 164 rawinsonde derived indices according to the lowest mean values of Validation Errors (VE). The Training Errors (TE) and Total Errors are also reported in table 1.5, where the letters u and m at the end of the variable names stand for Udine and Milano respectively, while the Greek letter ∆ indicates the difference between values generated over Milano and over Udine.

Figure 1.15: Empirical relationship between Julian day and lightning occurrence for the IASI Full data set. The peak of the activity is found in July.
Figure 1.16: Empirical relationship between CAPE and the occurrence of at least 10 lightning strikes. In the left panel CAPE was derived from Udine Campoformido rawinsondes, while in the right panel it was derived from Milano Linate rawinsondes.

Figure 1.17: Empirical relationship between PCSL10 and the occurrence of at least 10 lightning strikes. In the left panel PCSL10 was derived from Udine Campoformido rawinsondes, while in the right panel it was derived from Milano Linate rawinsondes.

Variable      <VE>    <TE>    <Total E>
PCS_L10m      0.340   0.338   0.339
ShowIu        0.298   0.277   0.282
WBZm          0.201   0.175   0.181
PCS_L10_M8m   0.198   0.173   0.180
∆PCS_M7       0.194   0.125   0.141
∆KI           0.066   0.046   0.051

Table 1.5: Forward selection algorithm: results for the best IASI derived predictors.

Figure 1.18: The Training-Validation (TV) diagram of the CEEs computed over the 12 bootstraps (instances of training/validation sets) for the first 6 “best variables” chosen by the classification forward selection algorithm. The sets of 12 TV points of each variable are represented alternately by filled circles and triangles, while the unfilled squares, connected by a dashed line, show the mean errors over the 12-point bootstraps.

1.3.3 Results

The Full data set (IASI-Lightnings) was divided into a Total set (which included 116 cases and was used to train the different ANN candidates) and a Test set (of 38 cases) used to select the best prediction system, in terms of absolute scores and consistency between the results obtained on the Total and the Test sets. The architecture chosen was an ANN with 3 inputs, 1 neuron in the hidden layer, and 1 output.
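The two quantities reported for each ANN in this report, the cross-entropy error (CEE) of the continuous output and the contingency table obtained by applying the 0.40 probability threshold, can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names, the clipping constant `eps`, and the choice of `>=` for the threshold comparison are assumptions, and the CEE follows the definition given later in sec. 2.1.2.

```python
import math


def cross_entropy_error(events, probs, eps=1e-12):
    """CEE = -(1/N) * sum(L*ln(x) + (1-L)*ln(1-x)) over all cases.

    events: 0/1 observed event flags; probs: continuous outputs in (0, 1).
    eps clips the probabilities away from 0 and 1 (an assumption made here
    only to keep the logarithm finite).
    """
    total = 0.0
    for event, p in zip(events, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += event * math.log(p) + (1 - event) * math.log(1.0 - p)
    return -total / len(events)


def contingency_counts(events, probs, threshold=0.40):
    """Return (hits, false_alarms, misses, correct_negatives).

    A forecast is YES when the output is >= threshold (inclusive comparison
    is an assumption; the report only gives the threshold value).
    """
    hits = false_alarms = misses = correct_negatives = 0
    for event, p in zip(events, probs):
        forecast_yes = p >= threshold
        if forecast_yes and event:
            hits += 1
        elif forecast_yes:
            false_alarms += 1
        elif event:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the report.
    observed = [1, 0, 1, 0]
    outputs = [0.9, 0.1, 0.3, 0.5]
    print(cross_entropy_error(observed, outputs))
    print(contingency_counts(observed, outputs))
```

Because the CEE is computed on the continuous output while the contingency table depends on the threshold, a network can show a large change in CEE between Total and Test sets even when some thresholded scores look similar.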
1.3.3.1 Training

Application of the ANN to the Total set led to a Total CEE of 0.27 (on 116 cases), while applying the probability threshold (0.40) to the continuous ANN output led to the following contingency table:

TOTAL           Event: YES   Event: NO
Forecast: YES   25           11
Forecast: NO    6            74

The analysis of the contingency table led to the following results:

TOTAL   POD    HIT    FAR    POFD   BIAS   TS     HSS    PSS    S(PSS)   ODDS
Score   0.81   0.85   0.31   0.13   1.16   0.59   0.64   0.68   0.08     28.03

1.3.3.2 Testing

Applying the ANN to the Test set led to a Test CEE of 0.81 (on 38 cases), while the entries found in the contingency table were:

TEST            Event: YES   Event: NO
Forecast: YES   7            7
Forecast: NO    7            17

The analysis of the contingency table led to the following results:

TEST    POD    HIT    FAR    POFD   BIAS   TS     HSS    PSS    S(PSS)   ODDS
Score   0.5    0.63   0.5    0.29   1      0.33   0.21   0.21   0.16     2.43

These scores are worse than those obtained applying the ANN to the Total Set, and indicate a poor generalization capacity of the network. The ROC curves for this ANN are shown in figure 1.19. It is worth emphasizing that, in this case, signs of overfitting are evident, and the performance on the Reduced Test dataset is worse than that obtained on the Full rawinsonde Test dataset.

Figure 1.19: ROC obtained by selected ANN for Total and Test IASI data

1.3.4 Discussion of the results

The poor results obtained in forecasting the Test convective events from IASI data and products, with respect to those obtained using rawinsondes, are likely due to:

1. the limited size of the IASI database (available retrievals in clear sky conditions), which is a factor of 10 smaller than the rawinsonde database (both for training and testing);
2. a general tendency of the retrievals to overestimate low level water vapor, which led to overestimation of the atmospheric instability.

The first hypothesis was investigated and the results are reported in the two following subsections.
Addressing the second hypothesis is part of the future work, as it requires a more detailed study of the inversion system performance.

1.3.4.1 Limited size of the IASI Full data set

Considering that the nowcasting of convective events using rawinsonde derived predictors was significantly better, a simple set of experiments was carried out to assess the effect of the Full data set size on the final skill figures. The experiments consisted essentially in replicating what is described in sec. 1.2, but using a reduced version of the rawinsonde Total and Test data sets, obtained by retaining from the Full rawinsonde data set only the cases represented in the Full IASI data set.

1. In the first experiment a new set of optimal predictors was determined using the Reduced Set of rawinsondes (150 cases in total). The best ANN-inputs, shown in figure 1.20, were LIu, MELm, HLVv, SWEATu, Helm, HDu. The best ANN architecture, using only the first 3 predictors and 1 hidden neuron, led to a Total CEE of 0.37 and a Test CEE of 0.58 and to the following contingency tables on the Total and Test sets:

RED. TOTAL      Event: YES   Event: NO
Forecast: YES   28           25
Forecast: NO    1            59

RED. TEST       Event: YES   Event: NO
Forecast: YES   13           12
Forecast: NO    1            11

The analysis of the contingency tables led to the following results:

RED. TOTAL   POD    HIT    FAR    POFD   BIAS   TS     HSS    PSS    S(PSS)   ODDS
Score        0.97   0.77   0.47   0.30   1.83   0.52   0.53   0.67   0.07     66.08

RED. TEST    POD    HIT    FAR    POFD   BIAS   TS     HSS    PSS    S(PSS)   ODDS
Score        0.93   0.65   0.48   0.52   1.79   0.5    0.35   0.41   0.16     11.92

and to the ROC shown in figure 1.21.

2. In the second experiment the ANN described in the previous point and developed on the Reduced rawinsonde Total Set was applied to the Full rawinsonde Test Set.
This led to a CEE of 0.55 on 415 cases, while applying the probability threshold (0.4) to the continuous ANN output led to the following contingency table on the Test set:

FULL TEST       Event: YES   Event: NO
Forecast: YES   146          93
Forecast: NO    16           160

The analysis of the contingency table led to the following results:

FULL TEST   POD    HIT    FAR    POFD   BIAS   TS     HSS    PSS    S(PSS)   ODDS
Score       0.74   0.74   0.39   0.37   1.47   0.57   0.49   0.53   0.04     15.70

3. In the third experiment the ANN described in sec. 1.2 and developed on the Full rawinsonde set (8 inputs, 2 hidden neurons, 1 output) was applied to the Reduced rawinsonde Total and Test sets. This led to CEE values of 0.29 and 0.27 respectively, while applying the probability threshold (0.4) to the continuous ANN output led to the following contingency table on the Test set:

RED TEST        Event: YES   Event: NO
Forecast: YES   11           0
Forecast: NO    3            21

The analysis of the contingency table led to the following results:

RED TEST   POD    HIT    FAR   POFD   BIAS   TS     HSS    PSS    S(PSS)   ODDS
Score      0.79   0.91   0.0   0.0    0.79   0.79   0.81   0.79   0.12     N/A

Note that these results are coherent with those obtained by the 8-input, 2-neuron ANN on its Full Test dataset and are independent of the very small sample size of the Reduced Test database used here.

Figure 1.20: The TV CEEs computed over the 12 bootstraps (instances of training/validation sets) for the first 6 “best ANN-inputs” chosen by the classification forward selection algorithm. The sets of 12 TV points of each variable are represented alternately by filled circles and triangles, while the unfilled squares, connected by a dashed line, show the mean errors over the 12-point bootstraps.

The outcomes of the 3 experiments were expected, but it was important to quantify the degradation of the performance due to the limited size of the total datasets.
The first experiment showed that, also for the rawinsondes, if the Full data set is small (and therefore the Total and Test sets are small) the ANN does a good job fitting the Total set but performs poorly on the Test set. The second experiment showed that part of the degradation of the performance of the ANN trained on the small Total data set is actually due to the lack of representativeness of the Reduced Test set: when a larger Full Test set is used the PSS improves noticeably. Finally, the third experiment confirmed that the representativeness of the Test set is not crucial: the same ANN which was performing well on the Full Test set, when applied to the Reduced Test Set, produced consistent scores, such as PSS = 0.79.

Figure 1.21: ROC obtained by selected ANN for the reduced Total and Test rawinsonde data

1.3.4.2 Increasing the Full IASI data set

By focusing on a single area of interest, it was possible to generate an ANN predictor trained on an IASI dataset twice as large as the Full IASI Set used in sec. 1.3. This section provides a short description of the results obtained by nowcasting convective events over the area of Milano only. The goal is to show that, by avoiding the need for coincident observations, more observations can be used in the experiment and better results achieved using IASI products. A more extensive study of the single area prediction scheme is left to future work.

1. MILANO: a new set of optimal predictors was determined using the Full IASI Set of retrievals for Milano (331 cases in total). The best ANN-inputs, shown in figure 1.22, were KI, PCS LW 12, PCS LW 17.
The best ANN, with 3 inputs and 1 hidden neuron, led to a Total CEE of 0.40 (on 242 cases) and a Test CEE of 0.42 (on 89 cases), and to the following contingency tables on the Total and Test sets:

MILANO TOTAL    Event: YES   Event: NO
Forecast: YES   36           41
Forecast: NO    25           140

MILANO TEST     Event: YES   Event: NO
Forecast: YES   17           13
Forecast: NO    6            53

The analysis of the contingency tables led to the following results:

MILANO TOTAL   POD    HIT    FAR    POFD   BIAS   TS     HSS    PSS    S(PSS)   ODDS
Score          0.59   0.73   0.53   0.23   1.26   0.35   0.33   0.36   0.06     4.91

MILANO TEST    POD    HIT    FAR    POFD   BIAS   TS     HSS    PSS    S(PSS)   ODDS
Score          0.74   0.79   0.43   0.2    1.30   0.47   0.35   0.49   0.10     11.55

and to the ROC shown in figure 1.23.

Figure 1.22: The TV CEEs computed over the 12 bootstraps (instances of training/validation sets) for the first 6 “best ANN-inputs” chosen by the classification forward selection algorithm. The sets of 12 TV points of each variable are represented alternately by filled circles and triangles, while the unfilled squares, connected by a dashed line, show the mean errors over the 12-point bootstraps.

The results obtained over Milano, using a Full dataset which is twice as large as the Milano-Udine Full dataset, are much better than those obtained for the whole Po Valley, and indicate that nowcasting of convection by IASI data over individual (smaller) areas is feasible and promising.

1.4 Conclusions

This report describes the results obtained by two forecast systems for thunderstorms (events with more than 10 lightning strikes between 11:00 and 17:00 UTC for the time period April - October) over the Po Valley.

1. For the first system, rawinsondes launched in Milano Linate and Udine Campoformido between 2004 and 2010 were used to produce sets of 50 instability indices.
Among these indices 8 predictors (DTCu, SWISSm, WBZm, KIu, Mixu, PWCm, ShowIm, KIm, where m stands for calculated over Milano, and u for calculated over Udine) were fed to an ANN with 2 neurons in the hidden layer and 1 continuous output. Application of the network to the Total (or Training) and Test sets led to CEE values of 0.335 and 0.375 respectively. By setting the discretization threshold (event: YES/NO) to 0.40 the ANN produced two contingency tables for Total and Test associated with PSS scores of 0.667 and 0.677 respectively.

2. The second system was designed to replicate the first one but with two substantial differences: wind dependent instability indices were not used, and 20 PCS were used as potential predictors. The available dataset for the IASI observations turned out to be much smaller than the rawinsonde database, because only a small fraction (about 30% per area of interest) of IASI observations were found to be in clear sky, and because the required coincidence of retrievals over the two areas reduced the available cases to 10% of the total cases. With the limited IASI dataset, it was possible to describe the Total set properly with a 3-input ANN (PCS_L10m, ShowIu, and WBZm) with 1 neuron in the hidden layer, but the capacity of the ANN to generalize to the Test Set was found to be poor: the CEE of 0.267 on the Total Set becomes 0.813 on the Test Set. With a discrimination threshold of 0.40 the PSS of 0.722 on the Total set became 0.333 on the Test set. It is worth emphasizing that the first predictors chosen were a combination of PCS and instability indices.

Figure 1.23: ROC obtained by selected ANN for the Full Total and Test IASI data over Milano
The poor results obtained in the generalization of the prediction of convective events from IASI data and products were found to be mostly dependent on the limited size of the IASI database (available retrievals in clear sky conditions), which is a factor of 10 smaller than the rawinsonde database (both for training and testing). However, a general tendency of the retrievals to overestimate low level water vapor, which led to overestimation of the atmospheric instability, was found and should be further investigated. Finally, by focusing on a single area of interest we were able to increase the size of the IASI database by a factor of two, and the prediction PSS score on the test set reached the value of 0.57, indicating that nowcasting of convection by IASI data over individual (smaller) areas, even if not yet as good as the nowcasting from high vertical resolution rawinsondes, is feasible and promising, especially in the perspective of using more polar satellites (AQUA, METOP-B, etc.) and even more with future geostationary platforms (MTG). Besides the final scores, the significance of the presented material relies on the correlation found between some of the PCS and the occurrence of convection, and on the validation of the IASI level 2 and level 3 products.

Chapter 2

Technical Report 5: validation of Level 3 products derived from vertical rawinsonde and retrieval profiles with occurrence of convection as detected by Lightnings.

Document: Technical Report 5
Written by: Paolo Antonelli, A. Manzato
Date: 26 November 2010
Reference: PA/IIS/TR05/2010/01

2.1 Relationship between Instability Indices and convection occurrence

This document describes the statistical links found between the instability indices derived from rawinsondes and IASI retrievals over the areas of Udine Campoformido, Pratica di Mare, and Cagliari (sec. 2.2 of PA/IIS/TR03/2010/01) and the occurrence of convective activity as detected by lightning (PA/IIS/TR01/2010/01).
Statistical relationships were evaluated in different ways, as described in the following sections, as a first step in investigating and comparing the skill of individual indices in predicting the occurrence of convection. Following in part the concept described in [9], the three different approaches used are based on linear regression (sec. 2.1.1), cross-entropy (sec. 2.1.2), and skill scores (sec. 2.1.3). It is worth stressing that in the first and the third case the results are strongly dependent on the threshold used to map the continuous instability variables into a boolean index (I_stable = 0; I_unstable = 1). Also, the relationships described in this document refer to individual indices. This work has been done to demonstrate the need for a statistical tool capable of combining all the indices to take advantage of their individual skills.

2.1.1 Linear correlation between instability binary indices and convection occurrence

In a first approach the linear correlation between indices and occurrence of convection was computed by mapping the continuous values of the indices and of the lightning counts into two boolean variables. For every rawinsonde followed by more than 1 lightning count in the 10 hrs time span, the convection occurrence variable L was set to yes (1). If no lightning activity was observed, the convection occurrence variable L was set to no (0). The continuous values of the instability indices were also mapped into boolean variables by defining, for each index, a threshold value, t, which if exceeded (not exceeded) by the index would lead to an instability variable It equal to yes (1) or no (0). For each index, derived from the rawinsondes or from the retrievals, the threshold value, t, was determined by maximizing the correlation between It and L.
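The threshold selection just described can be sketched as a simple scan over candidate thresholds. This is a minimal illustration under stated assumptions, not the procedure actually coded for the study: the function names are hypothetical, the candidate thresholds are taken to be the observed index values, and the best threshold is chosen by the largest absolute correlation (an assumption made here because several indices are anti-correlated with lightning occurrence). The correlation itself is the ratio of the covariance of L and It to the product of their standard deviations, as in equation 2.1.

```python
import math


def pearson(u, v):
    """R = cov(u, v) / (sigma_u * sigma_v); None if either is constant."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / n
    sigma_u = math.sqrt(sum((a - mean_u) ** 2 for a in u) / n)
    sigma_v = math.sqrt(sum((b - mean_v) ** 2 for b in v) / n)
    if sigma_u == 0.0 or sigma_v == 0.0:
        return None
    return cov / (sigma_u * sigma_v)


def best_threshold(index_values, lightning, candidates=None):
    """Scan thresholds t and return (t, R) with the largest |R|.

    index_values: continuous instability index, one value per case.
    lightning: boolean convection occurrence L (0 or 1), one per case.
    candidates: thresholds to try; defaults to the observed index values
    (an assumption, the report does not state how t was scanned).
    """
    if candidates is None:
        candidates = sorted(set(index_values))
    best_t, best_r = None, None
    for t in candidates:
        # It = 1 when the index exceeds the threshold, 0 otherwise.
        i_t = [1 if x > t else 0 for x in index_values]
        r = pearson(lightning, i_t)
        if r is None:
            continue  # It is constant for this threshold, R is undefined
        if best_r is None or abs(r) > abs(best_r):
            best_t, best_r = t, r
    return best_t, best_r


if __name__ == "__main__":
    # Hypothetical CAPE-like values (J/kg) and lightning flags.
    cape = [100, 200, 600, 800, 50, 900]
    lgt = [0, 0, 1, 1, 0, 1]
    print(best_threshold(cape, lgt))
```

A threshold selected in this way is tuned to the sample at hand, which is why the scores derived from it later in the chapter are described as strongly threshold dependent.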
Once the individual thresholds were defined, the linear correlation R was calculated as follows:

$$R = \frac{\mathrm{cov}(L, I_t)}{\sigma_L \, \sigma_{I_t}} \qquad (2.1)$$

2.1.2 Cross-entropy error between instability indices and convection occurrence

In the classification problem the error is usually taken to be the cross-entropy error (CEE), defined as:

$$CEE = -\frac{1}{N} \sum_{j=1}^{N} \left( L_j \ln(x_j) + (1 - L_j) \ln(1 - x_j) \right) \qquad (2.2)$$

where N is the total number of cases, L_j is a boolean representing the convection occurrence for case j, and x_j is the value of the instability index under consideration for case j.

2.1.3 Skill scores

As a last step, contingency tables for each instability index were derived using the It which maximizes R in equation 2.1. An example of the contingency table is shown in tbl. 2.1, where a and b represent the number of cases in which there was instability according to the value of the instability index considered (for example CAPE > It = 500) and lightning activity was observed in the 10 hrs following the rawinsonde/satellite observation (a), or lightning activity was not observed (b); c and d represent cases in which the index indicated stability (for example CAPE < It = 500) and lightning activity was observed (c), or was not observed (d). Contingency tables were derived for both rawinsondes and retrievals and for each individual index. For each contingency table a set of 5 scores was calculated and compared.

          INST (Y)   INST (N)
LGT (Y)   a          c
LGT (N)   b          d

Table 2.1: Example of contingency table

Score   Expression
POD     a / (a + c)
POFD    b / (b + d)
FAR     b / (a + b)
HIT     (a + d) / (a + b + c + d)
PSS     (ad - bc) / [(a + c)(b + d)]

Table 2.2: Scores

2.2 Results

2.2.1 Udine Campoformido: Linear correlation between instability binary indices and convection occurrence and cross-entropy

Using the Instability Indices derived from 1924 rawinsondes and 148 retrievals, the threshold values, It, needed to map the continuous indices into a stability boolean index were derived by maximizing the linear correlation as described in section 2.1.1.
Figures 2.3, 2.4, and 2.5 show the linear correlation R as a function of the threshold values It, for both the rawinsondes (magenta) and the retrievals (black). Out of the 1924 rawinsondes available, 628 were found to be associated with lightning activity in the 10 hrs following the rawinsonde launch, while out of the 148 available retrievals, 19 were associated with lightning activity. Values of the maximum linear correlation found for both rawinsondes and retrievals are shown in tbl. 2.6 (5th and 6th columns), while the values of cross-entropy are shown in the same table in the 7th and 8th columns. The Group column represents the index family as described in document PA/IIS/TR04/2010/01.

2.2.2 Udine Campoformido: skill scores

Skill scores described in section 2.1.3, shown in tbl. 2.7, were calculated from rawinsondes (columns 2, 4, 6, 8, and 10) and from retrievals (columns 3, 5, 7, 9, and 11). It is important to emphasize that the scores are strongly dependent on the threshold values, It, used to map the continuous indices into booleans for stability/instability, which were simply derived by maximizing the linear correlation.

2.2.3 Pratica di Mare: Linear correlation between instability binary indices and convection occurrence and cross-entropy

Using the Instability Indices derived from 1649 (574 associated with lightning activity) rawinsondes and 291 (64 associated with lightning activity) retrievals, the threshold values, It, needed to map the continuous indices into a stability boolean index were derived by maximizing the linear correlation as described in section 2.1.1. Figures 2.8, 2.9, and 2.10 show the linear correlation R as a function of the threshold values It, for both the rawinsondes (magenta) and the retrievals (black). Values of the maximum linear correlation found for both rawinsondes and retrievals are shown in tbl. 2.11 (5th and 6th columns).
The values of cross-entropy are shown in the same table in the 7th and 8th columns. The Group column represents the index family as described in document PA/IIS/TR04/2010/01.

Table 2.3: Udine, Campoformido: correlation between instability occurrence (as a function of threshold values) and lightning occurrence for CAPE, CIN, UpDr, LI, ShowI, DTC, DT500, and LCL

Table 2.4: Udine, Campoformido: correlation between instability occurrence (as a function of threshold values) and lightning occurrence for Tbase, MaxBuo, CAP, MRH, PWE, LRH, KI, Θe.

Table 2.5: Udine, Campoformido: correlation between instability occurrence (as a function of threshold values) and lightning occurrence for LFC.

Index    Group     Units   Sonde R   IASI R   Sonde CEE   IASI CEE
CAPE     magenta   J/kg     0.51      0.46    1.60        0.26
CIN      magenta   J/kg     0.48      0.32    1.12        1.16
UpDr     magenta   m/s      0.53      0.45    1.84        0.31
LI       grey      °C      -0.53     -0.41    0.95        1.38
ShowI    grey      °C      -0.54     -0.43    0.94        1.38
DTC      grey      °C      -0.51     -0.47    1.02        1.31
DTC500   grey      °C      -0.55     -0.44    0.93        1.38
LFC      grey      m        0.13      0.26    0.98        0.61
LCL      grey      m       -0.36     -0.23    0.87        0.51
Tbase    grey      °C       0.43      0.30    0.94        1.19
MaxBuo   grey      °C       0.51      0.50    0.53        0.58
KI       green     °C       0.49      0.39    0.87        0.78
CAP      green     °C      -0.28     -0.29    0.96        0.50
MRH      green     %        0.27      0.28    0.72        0.64
LRH      green     %        0.21      0.21    0.84        0.94
PWE      green     mm       0.44      0.37    0.53        0.50
Θe       green     K        0.40      0.35    0.58        0.69

Table 2.6: Udine Campoformido. List of correlation values between Instability Indices derived from 1924 (628 with lightning) rawinsondes and 148 (19 with lightning) retrievals ([15]) and number of lightning counts observed in the 10 hrs time span after rawinsonde launch.

         Sonde   IASI   Sonde   IASI   Sonde   IASI   Sonde   IASI   Sonde   IASI
Index    POD     POD    POFD    POFD   FAR     FAR    HIT     HIT    PSS     PSS
CAPE     0.73    0.79   0.20    0.19   0.36    0.62   0.78    0.81    0.53    0.60
CIN      0.92    0.89   0.42    0.41   0.48    0.76   0.69    0.63    0.50    0.48
UpDr     0.77    0.47   0.21    0.05   0.36    0.44   0.78    0.89    0.55    0.42
LI       0.25    0.11   0.80    0.70   0.87    0.98   0.22    0.28   -0.54   -0.59
ShowI    0.20    0.42   0.76    0.90   0.89    0.94   0.22    0.14   -0.56   -0.48
DTC      0.32    0.26   0.83    0.85   0.85    0.96   0.22    0.16   -0.52   -0.59
DTC500   0.15    0.16   0.74    0.77   0.91    0.97   0.23    0.22   -0.58   -0.61
LFC      0.27    0.79   0.16    0.40   0.55    0.78   0.65    0.62    0.11    0.39
LCL      0.14    0.00   0.51    0.31   0.88    1.00   0.38    0.60   -0.37   -0.31
Tbase    0.76    0.89   0.31    0.44   0.45    0.77   0.72    0.60    0.45    0.45
MaxBuo   0.79    0.53   0.25    0.05   0.40    0.41   0.76    0.89    0.54    0.47
KI       0.72    0.74   0.21    0.21   0.38    0.66   0.77    0.78    0.51    0.53
CAP      0.37    0.37   0.67    0.76   0.79    0.93   0.34    0.26   -0.30   -0.39
MRH      0.84    0.74   0.56    0.33   0.58    0.75   0.57    0.68    0.28    0.41
LRH      0.83    0.05   0.63    0.00   0.61    0.00   0.52    0.88    0.20    0.05
PWE      0.60    0.68   0.17    0.20   0.37    0.67   0.76    0.78    0.43    0.48
Θe       0.70    0.53   0.28    0.12   0.45    0.62   0.72    0.83    0.42    0.40

Table 2.7: Udine Campoformido. List of contingency table scores for Instability Indices derived from 1924 (628 with lightning) rawinsondes and 148 (19 with lightning) retrievals ([15]) and number of lightning counts observed in the 10 hrs time span after rawinsonde launch.

2.2.4 Pratica di Mare: skill scores

Skill scores described in section 2.1.3, shown in tbl. 2.12, were calculated from rawinsondes (columns 2, 4, 6, 8, and 10) and from retrievals (columns 3, 5, 7, 9, and 11). It is important to emphasize that the scores are strongly dependent on the threshold values, It, used to map the continuous indices into booleans for stability/instability, which were simply derived by maximizing the linear correlation.
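The five scores of table 2.2 follow directly from the four entries a, b, c, d of a contingency table laid out as in table 2.1. The sketch below is illustrative: the function name and the use of NaN for undefined ratios are assumptions, not part of the report. The usage example applies it to the rawinsonde Total contingency table of sec. 1.2.3.1 (316, 95, 63, 475), taken here as a, b, c, d.

```python
def skill_scores(a, b, c, d):
    """Scores of table 2.2 from contingency table entries.

    a: instability indicated and lightning observed (hits)
    b: instability indicated, no lightning (false alarms)
    c: stability indicated, lightning observed (misses)
    d: stability indicated, no lightning (correct negatives)
    Undefined ratios (zero denominator) are returned as NaN, which is a
    choice made for this sketch only.
    """
    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "POD": ratio(a, a + c),
        "POFD": ratio(b, b + d),
        "FAR": ratio(b, a + b),
        "HIT": ratio(a + d, a + b + c + d),
        "PSS": ratio(a * d - b * c, (a + c) * (b + d)),
    }


if __name__ == "__main__":
    # Rawinsonde Total set, sec. 1.2.3.1.
    for name, value in skill_scores(316, 95, 63, 475).items():
        print(f"{name}: {value:.2f}")
```

Note that PSS written as (ad - bc)/[(a + c)(b + d)] is algebraically equal to POD - POFD, which gives a quick way to cross-check the PSS columns of the tables against the POD and POFD columns.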
2.2.5 Cagliari: Linear correlation between instability binary indices and convection occurrence and cross-entropy

Using the Instability Indices derived from 1725 (381 with lightning) rawinsondes and 292 (27 with lightning) retrievals, the threshold values, I_t, needed to map the continuous indices into a boolean stability index were derived by maximizing the linear correlation as described in section 2.1.1. Figures 2.13, 2.14, and 2.15 show the linear correlation R as a function of the threshold values I_t, for both the rawinsondes (magenta) and the retrievals (black). Values of the maximum linear correlation found for both rawinsondes and retrievals are shown in tbl. 2.16 (Sonde R and IASI R columns), while the values of cross-entropy are shown in the same table (Sonde CEE and IASI CEE columns). The Group column represents the index family as described in document PA/IIS/TR04/2010/01.

Table 2.8: Pratica di Mare: correlation between instability occurrence (as a function of threshold values) and lightning occurrence for CAPE, CIN, UpDr, LI, ShowI, DTC, DT500, and LCL.

Table 2.9: Pratica di Mare: correlation between instability occurrence (as a function of threshold values) and lightning occurrence for Tbase, MaxBuo, CAP, MRH, PWE, LRH, KI, Θe.

Table 2.10: Pratica di Mare: correlation between instability occurrence (as a function of threshold values) and lightning occurrence for LFC.

Index   Group    Units  Sonde R  IASI R  Sonde CEE  IASI CEE
CAPE    magenta  J/kg   0.33     0.30    2.24       0.88
CIN     magenta  J/kg   0.33     0.27    1.15       1.04
UpDr    magenta  m/s    0.38     0.32    2.52       0.80
LI      grey     °C     -0.35    -0.28   0.84       0.77
ShowI   grey     °C     -0.39    -0.25   0.84       0.85
DTC     grey     °C     -0.40    -0.29   1.01       0.85
DTC500  grey     °C     -0.35    -0.29   0.82       0.82
LFC     grey     m      -0.13    -0.10   1.07       1.04
LCL     grey     m      -0.21    -0.23   0.95       0.63
Tbase   grey     °C     0.13     0.19    1.17       1.54
MaxBuo  grey     °C     0.41     0.31    0.60       0.73
KI      green    °C     0.41     0.14    1.00       0.91
CAP     green    °C     -0.29    -0.25   0.99       0.44
MRH     green    %      0.39     0.25    0.59       0.59
LRH     green    %      0.30     0.24    0.68       0.76
PWE     green    mm     0.24     0.15    0.64       0.70
Θe      green    K      0.07     -0.06   0.76       0.96

Table 2.11: Pratica di Mare. List of correlation values between Instability Indices derived from 1649 (574 with lightning) rawinsonde and 291 (64 with lightning) retrievals ([15]) and number of lightning counts observed in the 10 hrs time span after rawinsonde launch.

Index   Sonde  IASI   Sonde  IASI   Sonde  IASI   Sonde  IASI   Sonde  IASI
        POD    POD    POFD   POFD   FAR    FAR    HIT    HIT    PSS    PSS
CAPE    0.64   0.56   0.30   0.23   0.47   0.59   0.68   0.73   0.34   0.33
CIN     0.91   0.75   0.59   0.43   0.55   0.67   0.58   0.61   0.31   0.32
UpDr    0.62   0.75   0.24   0.37   0.42   0.63   0.71   0.66   0.38   0.38
LI      0.37   0.50   0.73   0.80   0.79   0.85   0.30   0.27   -0.36  -0.30
ShowI   0.37   0.28   0.77   0.58   0.79   0.88   0.28   0.39   -0.40  -0.30
DTC     0.14   0.52   0.55   0.82   0.88   0.85   0.34   0.25   -0.41  -0.30
DTC500  0.18   0.45   0.54   0.78   0.85   0.86   0.36   0.27   -0.36  -0.32
LFC     0.02   0.09   0.09   0.19   0.89   0.88   0.60   0.66   -0.07  -0.09
LCL     0.15   0.23   0.34   0.51   0.81   0.89   0.48   0.43   -0.20  -0.28
Tbase   0.82   0.28   0.70   0.11   0.62   0.59   0.48   0.75   0.12   0.17
MaxBuo  0.89   0.52   0.47   0.19   0.50   0.57   0.65   0.75   0.42   0.33
KI      0.58   0.98   0.18   0.89   0.36   0.76   0.74   0.31   0.40   0.10
CAP     0.37   0.36   0.67   0.66   0.77   0.87   0.34   0.35   -0.31  -0.30
MRH     0.82   0.52   0.42   0.24   0.49   0.62   0.67   0.71   0.41   0.28
LRH     0.57   0.70   0.26   0.41   0.46   0.67   0.68   0.62   0.31   0.29
PWE     0.29   0.17   0.10   0.07   0.39   0.58   0.69   0.77   0.19   0.11
Θe      0.96   0.44   0.92   0.51   0.64   0.80   0.38   0.48   0.04   -0.07

Table 2.12: Pratica di Mare. List of contingency table scores for Instability Indices derived from 1649 (574 with lightning) rawinsonde and 291 (64 with lightning) retrievals ([15]) and number of lightning counts observed in the 10 hrs time span after rawinsonde launch.

2.2.6 Cagliari: skill scores

Skill scores described in section 2.1.3, shown in tbl. 2.17, were calculated from rawinsondes (columns 2, 4, 6, 8, and 10) and from retrievals (columns 3, 5, 7, 9, and 11). It is important to emphasize that the scores are strongly dependent on the threshold values, I_t, used to map the continuous indices into a boolean stability/instability index; these thresholds were simply derived by maximizing the linear correlation.

2.3 Analysis of Results

Results showed that different statistical analyses produce different rankings of the forecasting skill of the indices. In particular, over Udine Campoformido, the approach based on Max(R) (sec. 2.1.1) is more favorable to UpDr, LI, ShowI and DTC500, which show sharp peaks in figures 2.3, 2.4, and 2.5, while the second method, based on Min(CEE) (sec. 2.1.2), is favorable to MaxBuo, PWE and Θe, where the last two exhibit a wider maximum rather than a sharp one. Differences in the ranking should be further investigated. Over Pratica di Mare, the approach based on Max(R) rewards DTC, MaxBuo and KI, while the second approach, Min(CEE), rewards MaxBuo, PWE and MRH. Over Cagliari, Max(R) and Min(CEE) both reward MaxBuo and MRH. The worst index according to Min(CEE) is consistently UpDr, while for Max(R) it is always LFC.

Table 2.13: Cagliari: correlation between instability occurrence (as a function of threshold values) and lightning occurrence for CAPE, CIN, UpDr, LI, ShowI, DTC, DT500, and LCL.

Table 2.14: Cagliari: correlation between instability occurrence (as a function of threshold values) and lightning occurrence for Tbase, MaxBuo, CAP, MRH, PWE, LRH, KI, Θe.

Table 2.15: Cagliari: correlation between instability occurrence (as a function of threshold values) and lightning occurrence for LFC.

Index   Group    Units  Sonde R  IASI R  Sonde CEE  IASI CEE
CAPE    magenta  J/kg   0.30     0.16    1.56       0.30
CIN     magenta  J/kg   0.35     0.21    1.28       0.61
UpDr    magenta  m/s    0.30     0.18    1.98       0.41
LI      grey     °C     -0.30    -0.23   0.78       0.89
ShowI   grey     °C     -0.35    -0.14   0.90       0.94
DTC     grey     °C     -0.36    -0.19   0.95       1.16
DTC500  grey     °C     -0.28    -0.18   0.79       0.88
LFC     grey     m      -0.07    -0.12   0.83       0.71
LCL     grey     m      -0.15    -0.21   0.69       0.50
Tbase   grey     °C     0.11     0.12    1.38       1.42
MaxBuo  grey     °C     0.40     0.18    0.58       0.45
KI      green    °C     0.35     0.21    1.15       0.84
CAP     green    °C     -0.28    -0.19   0.77       0.26
MRH     green    %      0.41     0.29    0.55       0.68
LRH     green    %      0.30     0.25    0.69       0.92
PWE     green    mm     0.14     -0.18   0.65       0.79
Θe      green    K      -0.11    -0.18   0.73       0.68

Table 2.16: Cagliari. List of correlation values between Instability Indices derived from 1725 (381 with lightning) rawinsonde and 292 (27 with lightning) retrievals ([15]) and number of lightning counts observed in the 10 hrs time span after rawinsonde launch.

Index   Sonde  IASI   Sonde  IASI   Sonde  IASI   Sonde  IASI   Sonde  IASI
        POD    POD    POFD   POFD   FAR    FAR    HIT    HIT    PSS    PSS
CAPE    0.87   0.85   0.51   0.58   0.67   0.87   0.57   0.46   0.36   0.27
CIN     0.82   0.81   0.39   0.45   0.63   0.84   0.65   0.58   0.43   0.37
UpDr    0.62   0.81   0.27   0.51   0.61   0.86   0.70   0.52   0.35   0.30
LI      0.18   0.56   0.54   0.86   0.91   0.94   0.40   0.18   -0.36  -0.30
ShowI   0.18   0.07   0.60   0.28   0.92   0.97   0.35   0.66   -0.42  -0.21
DTC     0.13   0.19   0.57   0.51   0.94   0.96   0.37   0.46   -0.43  -0.32
DTC500  0.24   0.96   0.58   1.00   0.90   0.91   0.38   0.09   -0.34  -0.04
LFC     0.06   0.11   0.11   0.29   0.86   0.96   0.71   0.65   -0.05  -0.18
LCL     0.15   0.19   0.31   0.54   0.88   0.97   0.57   0.43   -0.16  -0.35
Tbase   0.99   0.67   0.94   0.46   0.77   0.87   0.27   0.55   0.06   0.21
MaxBuo  0.93   0.63   0.46   0.34   0.63   0.84   0.63   0.66   0.48   0.29
KI      0.79   0.26   0.37   0.06   0.62   0.70   0.67   0.88   0.42   0.20
CAP     0.68   0.15   0.91   0.47   0.83   0.97   0.22   0.50   -0.23  -0.32
MRH     0.82   0.63   0.33   0.20   0.59   0.76   0.70   0.78   0.48   0.43
LRH     0.76   0.89   0.40   0.46   0.65   0.84   0.63   0.57   0.36   0.43
PWE     0.11   0.96   0.03   1.00   0.53   0.91   0.78   0.09   0.07   -0.04
Θe      0.85   0.11   0.93   0.41   0.79   0.97   0.25   0.54   -0.08  -0.30

Table 2.17: Cagliari. List of contingency table scores for Instability Indices derived from 1725 (381 with lightning) rawinsonde and 292 (27 with lightning) retrievals ([15]) and number of lightning counts observed in the 10 hrs time span after rawinsonde launch.

Before drawing the conclusions it is important to note that for the rawinsondes the probability of a convective event (for example over Udine) is p_raw = 628/1924 = 33%, while for the retrievals it is p_ret = 19/148 = 13%. The large discrepancy, p_raw > p_ret, is likely due to the higher probability of having clouds associated to convective events, since retrievals are only available in clear sky.
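The contingency table scores reported in tables 2.7, 2.12, and 2.17 can be illustrated with a short sketch. Section 2.1.3 is not reproduced here, so the formulas below are the standard definitions for dichotomous forecasts and are an assumption about what the report used. They do agree with the tabulated values in that PSS equals POD minus POFD to within rounding. The counts a, b, c, d are hypothetical; only the two base rates use numbers quoted in the text.

```python
def ratio(num, den):
    """Safe division: undefined scores are returned as NaN."""
    return num / den if den else float("nan")


def skill_scores(a, b, c, d):
    """Scores from a 2x2 contingency table.

    a = hits, b = false alarms, c = misses, d = correct negatives.
    """
    n = a + b + c + d
    pod = ratio(a, a + c)    # probability of detection
    pofd = ratio(b, b + d)   # probability of false detection
    return {
        "POD": pod,
        "POFD": pofd,
        "FAR": ratio(b, a + b),   # false alarm ratio
        "HIT": ratio(a + d, n),   # fraction of correct forecasts
        "PSS": pod - pofd,        # Peirce skill score
    }


# Hypothetical counts, for illustration only.
scores = skill_scores(a=30, b=10, c=10, d=50)

# Event base rates quoted in the text for Udine Campoformido.
p_raw = 628 / 1924   # rawinsondes with lightning
p_ret = 19 / 148     # retrievals with lightning
```

The different base rates matter when comparing columns: FAR in particular depends on event frequency, so a higher IASI FAR does not by itself imply a worse index.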
2.4 Conclusions

This document described the results obtained by investigating the statistical links between the instability indices derived from rawinsondes and IASI retrievals over the areas of Udine Campoformido, Pratica di Mare, and Cagliari, and the occurrence of convective activity as detected by lightning. Statistical relationships were evaluated in different ways as a first step in understanding and comparing the skills of individual indices in predicting the occurrence of convection. Following in part the concept described in [9], the three different approaches used were based on linear regression (sec. 2.1.1), cross-entropy (sec. 2.1.2), and skill scores (sec. 2.1.3). It is worth stressing that in the first and the third case the results found were strongly dependent on the threshold used to map the continuous instability variables into a boolean index (I_stable = 0; I_unstable = 1). As expected, single indices have different skills in forecasting convective activity; therefore the development of a statistical tool capable of combining all the indices to take advantage of their individual skills is highly recommended.

In terms of comparing the information obtained from the rawinsondes and the retrievals, the results shown in this document remain consistent with the conclusions drawn in the previous report (PA/IIS/TR04/2010/01), which stated that overall the results obtained are encouraging in defining a procedure to operationally use the IASI retrievals in the derivation of instability indices useful for forecasting and nowcasting of instability activity.
Chapter 3

Technical Report 4: Comparison of Level 3 products (Instability Indices) derived from satellite observations and rawinsondes

Document: Technical Report 4
Written by: Paolo Antonelli
Date: 4 November 2010
Reference: PA/IIS/TR04/2010/01

3.1 Generation of Instability Indices

A set of eighteen instability indices was derived both from the available rawinsondes (hereafter referred to as Υ_R) and from the available IASI retrievals (hereafter referred to as Υ_I). The instability indices, listed in table 3.1, were generated using the software Sound_Analys.py, developed at the Osservatorio Meteorologico Regionale (OSMER) of Friuli Venezia Giulia by Agostino Manzato [15]. Besides providing capabilities to derive a large set of instability indices, Sound_Analys.py implements three different schemes for evaluating the buoyancy and performing the adiabatic lifting. The study described in this document uses only the results obtained with the Tv scheme, which corresponds to the simple T scheme except that the parcel is considered to consist not only of dry air (which would neglect the water vapor) but also of the non-saturated vapor during the "dry" lifting. In other words, the Tv scheme is based on a "moist" adiabatic lifting process. Specific details about the three approaches can be found in [11].

The remaining part of this section describes the basic concepts of the lifted parcel theory. Before proceeding further it is worth emphasizing that not all the indices rely on this theory: for example, the K-index (KI) depends on the atmospheric state only, regardless of the assumptions made on vertical dynamics schemes.

Index   Units
CAPE    J/kg
CIN     J/kg
UpDr    m/s
LI      °C
ShowI   °C
DTC     °C
DTC500  °C
LFC     m
LCL     m
Tbase   °C
KI      °C
CAP     °C
MRH     %
LRH     %
PWE     mm
Θe      K
MaxBuo  °C
Swiss   -

Table 3.1: List of Instability Indices derived from rawinsonde and retrievals. Detailed description of the individual indices can be found in [15].
3.1.1 Lifted Parcel Theory assumptions

The assumptions behind the calculation of the instability indices dependent on the lifted parcel theory can be schematically summarized as:

• the rising parcel does not mix with the environment;
• the parcel pressure is always equal to the environmental pressure at the same height;
• the parcel rises along a moist adiabat until it becomes saturated, and afterwards it rises along a wet adiabat;
• condensed water falls out of the parcel (pseudo-adiabatic process), so there is no freezing of the condensed water.

3.1.2 Lifted Parcel Theory

The implementation of the lifted parcel theory in Sound_Analys.py starts with the selection of the initial low level parcel that will rise, and that represents the moist and warm air that will create the cloud. Sound_Analys.py defines a layer of 30 hPa and raises it stepwise through the lowest 250 hPa, computing for each central layer pressure (p_c) the mean sounding values within the layer ([p_c − 15, p_c + 15] hPa). The layer with the most unstable features, i.e. the largest pressure averaged equivalent potential temperature Θe0, is selected as the initial parcel (Most Unstable Parcel). Initial conditions for the parcel are (p0, T0, Td0). The parcel is then lifted to a higher level (p, T, Td) along a moist adiabat, so that initial potential temperature Θe0 and q0 are conserved until the parcel becomes saturated at the Lifting Condensation Level (LCL). Above the LCL the parcel is lifted along a wet adiabat, which conserves only the equivalent potential temperature Θe0 while q = q_sat(T, p). The level, if any, at which the parcel becomes lighter than the environment is called the Level of Free Convection (LFC), and the sounding is then said to be potentially unstable. Afterwards the parcel rises up to the Equilibrium Level (EL), where the density of the parcel is equal to the environmental density.
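The selection of the Most Unstable Parcel described above can be sketched as follows. This is not the Sound_Analys.py code: the function names, the 5 hPa step, and the choice to average only over sounding levels that fall inside the layer (without interpolating to the layer edges) are assumptions made for illustration.

```python
def layer_mean(p, x, p_top, p_bot):
    """Pressure-weighted (trapezoidal) mean of x for levels with p_top <= p <= p_bot."""
    pts = sorted(((pi, xi) for pi, xi in zip(p, x) if p_top <= pi <= p_bot),
                 reverse=True)  # from the bottom (high pressure) upwards
    if len(pts) < 2 or pts[0][0] == pts[-1][0]:
        return None  # not enough levels to define a layer average
    area = 0.0
    for (p0, x0), (p1, x1) in zip(pts, pts[1:]):
        area += 0.5 * (x0 + x1) * (p0 - p1)
    return area / (pts[0][0] - pts[-1][0])


def most_unstable_layer(p, theta_e, depth=30.0, search=250.0, step=5.0):
    """Return (central pressure, mean theta_e) of the layer with the largest mean theta_e.

    depth  : layer thickness in hPa (30 hPa in the text)
    search : depth of the region scanned above the surface (250 hPa in the text)
    step   : vertical step of the scan in hPa (hypothetical value)
    """
    p_sfc = max(p)
    half = depth / 2.0
    best = None
    pc = p_sfc - half
    while pc - half >= p_sfc - search:
        mean = layer_mean(p, theta_e, pc - half, pc + half)
        if mean is not None and (best is None or mean > best[1]):
            best = (pc, mean)
        pc -= step
    return best


# Synthetic sounding (illustration only): theta_e peaks at 900 hPa.
pressure = list(range(1000, 690, -10))                 # hPa, surface to 700 hPa
theta_e = [330 - abs(pi - 900) / 10 for pi in pressure]  # K
pc_mu, theta_e0 = most_unstable_layer(pressure, theta_e)
```

In this synthetic case the scan picks the layer centred on 900 hPa; the mean values of the selected layer would then supply the initial conditions (p0, T0, Td0) for the lifting.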
3.1.3 Selection of Instability Indices

Manzato [10] shows that different instability indices have different sensitivity to the vertical resolution. Figure 3.1 shows the correlation, R², between the indices calculated from the same rawinsondes but sampled at different vertical resolutions. The indices which are dependent on the lifted parcel theory show lower correlations than the other indices; therefore it is likely that they will present the largest deviation when calculated for the IASI retrievals rather than for the high vertical resolution rawinsondes. For this reason, throughout the whole document the indices have been divided in three main families or groups:

1. indices heavily dependent on lifted parcel theory (CAPE, CIN, UpDr), hereafter referred to as the magenta group;
2. indices somewhat dependent on lifted parcel theory (LI, ShowI, DTC, DTC500, LFC, Tbase), belonging to the grey group;
3. indices independent of lifted parcel theory (KI, CAP, MRH, LRH, Θe, MaxBuo, PWE), belonging to the green group.

Figure 3.1: Table of correlation, from [10], between instability indices calculated from the same rawinsonde sampled at different vertical resolutions: soundings at full vertical resolution compared with soundings reduced to the TEMP format (WMO code). Values highlighted in green refer to the indices which are independent of Lifted Parcel Theory. Grey and magenta indices are partially and highly dependent on the Lifted Parcel Theory.

3.2 Results

The instability indices listed in table 3.1 were generated from sets, Υ_R, of rawinsondes at the highest available vertical resolution, and from sets, Υ_I, of IASI retrievals. In order to minimize the impact of the issues described in technical report #3 (ref. PA/IIS/TR03/2010/01) the following procedure was used:

1. instability indices were derived for each satellite overpass with one or more spectrally successful IASI retrievals by applying Sound_Analys.py to the mean profiles obtained from all the available retrievals for that overpass;
2. the same instability indices were derived from the closest available rawinsonde, regardless of time order but with a time difference constraint of 200 min;
3. the linear correlation between IASI derived and rawinsonde derived indices was calculated for each site over the selected time period.

3.2.1 Udine Campoformido

Available data over Udine Campoformido include: 2069 rawinsondes (as described in technical report #1, ref. PA/IIS/TR01/2010/01) for the time period between July 2007 and December 2009, and 702 spectrally successful retrievals associated to 217 METOP-A overpasses. The linear correlation coefficients calculated are listed in tbl. 3.2.

Index   Group    Units  R
CAPE    magenta  J/kg   .44
CIN     magenta  J/kg   .51
UpDr    magenta  m/s    .45
LI      grey     °C     .81
ShowI   grey     °C     .80
DTC     grey     °C     .60
DTC500  grey     °C     .79
LFC     grey     m      .26
LCL     grey     m      .61
Tbase   grey     °C     .86
MaxBuo  grey     °C     .58
KI      green    °C     .88
CAP     green    °C     .22
MRH     green    %      .78
LRH     green    %      .67
PWE     green    mm     .93
Θe      green    K      .94
Swiss   N/A      -      N/A

Table 3.2: List of Instability Indices derived from rawinsonde and retrievals. Detailed description of the individual indices can be found in [15].

3.2.2 Pratica di Mare

Available data over Pratica di Mare include: 1757 rawinsondes (as described in technical report #1, ref. PA/IIS/TR01/2010/01), and 1328 spectrally successful retrievals associated to 397 METOP-A overpasses. The linear correlation coefficients calculated are listed in tbl. 3.3.

3.2.3 Cagliari

Available data over Cagliari include: 1858 rawinsondes (as described in technical report #1, ref. PA/IIS/TR01/2010/01), and 922 spectrally successful retrievals associated to 348 METOP-A overpasses.
Linear correlation coefficients calculated are listed in tbl. 3.4.

Index   Group    Units  R
CAPE    magenta  J/kg   .27
CIN     magenta  J/kg   .29
UpDr    magenta  m/s    .31
LI      grey     °C     .49
ShowI   grey     °C     .57
DTC     grey     °C     .45
DTC500  grey     °C     .57
LFC     grey     m      .47
LCL     grey     m      .37
Tbase   grey     °C     .70
MaxBuo  grey     °C     .40
KI      green    °C     .72
CAP     green    °C     .51
MRH     green    %      .74
LRH     green    %      .74
PWE     green    mm     .86
Θe      green    K      .91
Swiss   N/A      -      N/A

Table 3.3: List of Instability Indices derived from rawinsonde and retrievals. Detailed description of the individual indices can be found in [15].

Index   Group    Units  R
CAPE    magenta  J/kg   .30
CIN     magenta  J/kg   .32
UpDr    magenta  m/s    .32
LI      grey     °C     .35
ShowI   grey     °C     .55
DTC     grey     °C     .48
DTC500  grey     °C     .59
LFC     grey     m      .45
LCL     grey     m      .25
Tbase   grey     °C     .58
MaxBuo  grey     °C     .40
KI      green    °C     .69
CAP     green    °C     .56
MRH     green    %      .80
LRH     green    %      .78
PWE     green    mm     .82
Θe      green    K      .88
Swiss   N/A      -      N/A

Table 3.4: List of Instability Indices derived from rawinsonde and retrievals. Detailed description of the individual indices can be found in [15].

3.3 Analysis of Results

A first look at the results indicates that the green indices calculated from rawinsondes and averaged retrievals show, as expected, higher correlation than the grey and magenta indices, with the exception of CAP (green group), which relies on the very high vertical resolution details of the lowest 400 hPa and shows a low correlation. In addition, results over Udine Campoformido show higher values of correlation than the results over Pratica di Mare and Cagliari. This was expected from the discussion of the retrieval validation included in technical report #3 (ref. PA/IIS/TR03/2010/01), which showed how retrievals over coastal areas were very likely more contaminated by clouds, and also how the averaging of continental/ocean profiles in regimes of breezes might not be an optimal approach.
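The first two steps of the procedure in section 3.2 (one mean retrieved profile per overpass, paired with the closest rawinsonde within 200 min regardless of time order) can be sketched as below. Function names and sample times are hypothetical, and profiles are assumed to be already on a common vertical grid.

```python
from datetime import datetime, timedelta


def mean_profile(profiles):
    """Level-by-level mean of the retrieved profiles of one overpass."""
    return [sum(level) / len(level) for level in zip(*profiles)]


def match_overpasses(overpass_times, sonde_times, max_gap_min=200):
    """Pair each overpass with the closest rawinsonde, before or after it.

    Returns (overpass index, sonde index) pairs; overpasses with no
    rawinsonde within max_gap_min minutes are dropped.
    """
    max_gap = timedelta(minutes=max_gap_min)
    pairs = []
    for i, t_sat in enumerate(overpass_times):
        j = min(range(len(sonde_times)),
                key=lambda k: abs(sonde_times[k] - t_sat))
        if abs(sonde_times[j] - t_sat) <= max_gap:
            pairs.append((i, j))
    return pairs


# Illustrative times only (not taken from the report).
overpasses = [datetime(2008, 7, 1, 9, 30), datetime(2008, 7, 1, 20, 30)]
sondes = [datetime(2008, 7, 1, 6, 0), datetime(2008, 7, 1, 12, 0)]
pairs = match_overpasses(overpasses, sondes)
t_mean = mean_profile([[280.0, 270.0], [282.0, 272.0]])
```

Each index would then be computed on both members of every pair, and the linear correlation R taken across the pairs for each site.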
Besides the presentation and analysis of the absolute values of the correlations, it was considered important to relate the results obtained to the correlation existing between indices derived from forecast profiles (at different times) and those derived from rawinsondes, as described in [10]. The fundamental reason for this comparison is that the rawinsonde is generally considered more useful in operational now-casting activities than in longer range forecasting activities, and for this reason operational forecasters are more inclined to use instability indices derived from model forecasts than from rawinsondes. Relating the results to what is used operationally provides a practical quality assessment criterion.

3.3.1 Forecast derived indices

The comparison of indices derived from forecasts and from rawinsondes was presented by Manzato [10] and is shown in figure 3.2, where the columns show the change in correlation (in terms of R²) between the indices derived from the rawinsondes and the indices derived from forecasts at different times (from 24 hrs to 132 hrs in advance) for Udine Campoformido. Colored oval shapes show the results obtained for the correlation between indices derived from rawinsondes and from IASI retrievals over the same site. However, before proceeding any further with the comparison it is worth noting that the correlations obtained over Udine Campoformido, Pratica di Mare, and Cagliari, described in sections 3.2.1, 3.2.2, and 3.2.3, are affected by two factors: 1) the limited vertical resolution of the retrievals compared to the vertical resolution of the rawinsondes; and 2) the time difference between the satellite overpass and the rawinsonde launch, which on average is 116 min.
While the first factor is really the potentially limiting factor in the use of IASI retrievals for instability forecasting, and it was addressed in section 3.1.3, the second factor is a spurious error due only to the time difference, and it has to be taken out of consideration for a proper comparison of rawinsonde-retrieval and rawinsonde-forecast index correlations (see following section).

3.3.2 Time dependence of instability indices

In order to define an upper bound of the time variability of the instability indices, Sound_Analys.py was applied to a set of consecutive rawinsondes launched within 6 hrs, at 06:00 and 12:00 UTC. The correlations found are listed in table 3.5 and shown in figure 3.3, in which red rectangles indicating the correlation values found for consecutive rawinsondes are superimposed on the values displayed in figure 3.2.

Figure 3.2: Table of correlation (in terms of R²) between instability indices calculated from forecasts for different times (columns) and from rawinsondes (from [10]). Colored oval shapes show the results of the correlation between indices derived from IASI retrievals and from rawinsondes over Udine Campoformido. Values highlighted in green refer to the indices which are independent of Lifted Parcel Theory. Grey and magenta indices are partially and highly dependent on the Lifted Parcel Theory.

Index   Group    Units  R
CAPE    magenta  J/kg   .67
CIN     magenta  J/kg   .70
UpDr    magenta  m/s    .75
LI      grey     °C     .91
ShowI   grey     °C     .89
DTC     grey     °C     .81
DTC500  grey     °C     .87
LFC     grey     m      .70
LCL     grey     m      .74
Tbase   grey     °C     .90
MaxBuo  grey     °C     .79
KI      green    °C     .87
CAP     green    °C     .39
MRH     green    %      .81
LRH     green    %      .86
PWE     green    mm     .95
Θe      green    K      .97
Swiss   N/A      -      N/A

Table 3.5: List of Instability Indices derived from 267 pairs of consecutive rawinsondes launched at 06:00 and 12:00 UTC over Udine Campoformido. Detailed description of the individual indices can be found in [15].
Considering the errors due to the different vertical resolutions of retrievals and rawinsondes (as shown in figure 3.1) and the errors due to the time differences between satellite overpasses and rawinsonde launches, 116 min on average (whose upper bounds are shown in figure 3.3 for a 360 min delay), it becomes clear that for several indices a large fraction of the correlation degradation between retrievals and rawinsondes is actually due to the time differences.

Figure 3.3: Table of correlation (in terms of R²) between instability indices calculated from forecasts for different times (columns) and from rawinsondes (from [10]). Colored oval shapes show the results of the correlation between indices derived from IASI retrievals and from rawinsondes over Udine Campoformido. Values highlighted in green refer to the indices which are independent of Lifted Parcel Theory. Grey and magenta indices are partially and highly dependent on the Lifted Parcel Theory. Red rectangles show the correlation between indices derived for a set of consecutive rawinsondes launched within 6 hrs.

3.4 Conclusions

The study described by this technical report indicates that instability indices that are independent of, or weakly dependent on, Lifted Parcel Theory tend to correlate well between retrievals and rawinsondes. Correlation values, once the time differences have been accounted for, are comparable with the correlation found between indices derived from rawinsondes and forecast profiles. It is important to stress that:

• at the report submission time, the retrievals used were labeled as spectrally successful but were not quality controlled for potential cloud contamination; therefore the correlations over Udine were found to be higher than those over Pratica di Mare and Cagliari (as described in technical report #3, ref. PA/IIS/TR03/2010/01);
• results over the three sites could be improved by performing a suitable cloud quality control;
• the availability of extra rawinsondes time and space coincident with IASI observations would greatly improve the significance of the presented work.

Overall the results obtained are encouraging in defining a procedure to operationally use the IASI retrievals in the derivation of instability indices useful for forecasting and nowcasting of instability activity.

Chapter 4

Technical Report 3: Validation of baseline Retrieval with rawinsondes

Document: Technical Report 3
Written by: Paolo Antonelli
Date: 30 October 2010
Reference: PA/IIS/TR03/2010/01

4.1 Inversion with UWPHYSRET

UWPHYSRET is a research tool built on a MATLAB implementation of a Bayesian retrieval system. It allows for the retrieval of atmospheric parameters from high spectral resolution infrared observations. The package is based on LBLRTM (version 11.7) and Optimal Spectral Sampling (OSS) for the computation of simulated radiances and Jacobians. The system works with IASI observations; however, the modular nature of the package makes it suitable for extension to other current and future high spectral resolution instruments such as AIRS and MTG-IRS. It allows for simultaneous retrievals of: vertical profiles of temperature, water vapor mixing ratio, carbon dioxide, and ozone, together with surface temperature and emissivity.

4.1.1 IASI observations used in retrieval

IASI L1C data were collected over an area of 1x1 degree around Pratica di Mare, Udine Campoformido, and Cagliari, Italy. Before inversion, data were thinned using the MAIA cloud mask. Only observations labeled more than 98% clear were retained. Observations over Udine Campoformido and Cagliari were compressed and reconstructed according to the procedure described in PA/ScP/2010/04 Annex 1 using Principal Components derived at EUMETSAT.
4.1.2 A-priori Covariance: in-situ observations used for climatology

Site-dedicated a-priori covariances were obtained using available rawinsondes observed prior to June 2007 in Pratica di Mare, Italy. Rawinsondes were provided by the Centro Nazionale per la Meteorologia e la Climatologia Aeronautica (CNMCA) of the Italian Air Force. Observed vertical profiles were extrapolated up to 0.1 hPa using climatological profiles, and were quality controlled for saturation and/or missing values. Ozone and carbon dioxide profiles were generated by random perturbations of climatological profiles. Surface temperature was randomly generated from the lowest level temperature, constrained by surface type, time of day, and latitude. The climatological covariance matrix was regularized using Singular Value Decomposition. For this experiment the first guess emissivity was derived from the Dan Zhou emissivity atlas. Retrieval of surface emissivity was enabled.

4.1.3 Error Covariance

The error covariance matrix for the 4021 channels between 670 and 2200 cm^-1 used in the inversion was obtained by increasing the IASI nominal noise, as provided by CNES, by 30% to account for forward model errors. The error covariance matrix was considered purely diagonal.

4.1.4 Forward Model

The Optimal Spectral Sampling forward model (from Atmospheric & Environmental Research, Inc.) was used in this experiment to allow for faster computation of the retrievals.

4.1.5 Minimization Scheme

The minimization scheme used to derive the retrievals was based on the Levenberg-Marquardt approach:

$$x_{i+1} = x_i + \left[ (1+\gamma)\, S_a^{-1} + K_i^T S_\epsilon^{-1} K_i \right]^{-1} \left\{ K_i^T S_\epsilon^{-1} \left[ y - F(x_i) \right] - S_a^{-1} \left[ x_i - x_a \right] \right\} \qquad (4.1)$$

where x_i is the state vector at iteration i, x_a is the a-priori state, S_a is the a-priori covariance described in sec. 4.1.2, K_i is the Jacobian of the forward model F evaluated at x_i, S_ε is the observation error covariance matrix described in sec. 4.1.3, and γ is a variable parameter.
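To make the structure of the update in eq. 4.1 concrete, the sketch below writes one iteration for the simplest possible case: a single-element state vector and a purely diagonal observation error covariance (as stated in sec. 4.1.3). This is an illustration only, not the UWPHYSRET code; the function name, argument names, and numbers are hypothetical, and the real system operates on full vectors and matrices.

```python
def lm_step(x_i, x_a, var_a, y, f_x, k, var_eps, gamma):
    """One update of eq. 4.1 for a scalar state.

    x_i     : current state estimate
    x_a     : a-priori state
    var_a   : a-priori variance (S_a for a one-element state)
    y       : observed radiances (list)
    f_x     : simulated radiances F(x_i) (list)
    k       : Jacobian dF/dx at x_i, one value per channel (list)
    var_eps : diagonal of S_eps, one variance per channel (list)
    gamma   : damping parameter
    """
    # K^T S_eps^-1 K reduces to a scalar for a one-element state.
    kt_se_k = sum(kj * kj / vj for kj, vj in zip(k, var_eps))
    # K^T S_eps^-1 [y - F(x_i)]
    kt_se_res = sum(kj * (yj - fj) / vj
                    for kj, yj, fj, vj in zip(k, y, f_x, var_eps))
    lhs = (1.0 + gamma) / var_a + kt_se_k
    rhs = kt_se_res - (x_i - x_a) / var_a
    return x_i + rhs / lhs


# Hypothetical linear forward model F(x) = k * x with two channels.
k = [1.0, 2.0]
y = [2.0, 4.0]
undamped = lm_step(0.0, 0.0, 1.0, y, [0.0, 0.0], k, [1.0, 1.0], gamma=0.0)
damped = lm_step(0.0, 0.0, 1.0, y, [0.0, 0.0], k, [1.0, 1.0], gamma=0.75)
weak_prior = lm_step(0.0, 0.0, 1e12, y, [0.0, 0.0], k, [1.0, 1.0], gamma=0.0)
```

With γ = 0 the step is the Gauss-Newton step; a larger γ shortens it. With a very weak prior the linear example lands on the least-squares solution in one step.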
The starting value was set to γ = 0.75; γ was increased by a factor of 1.5 for a χ² ratio > 0.75 and decreased by a factor of 2 for a χ² ratio < 0.25, where

$$\chi^2 = \left[ y - F(x_i) \right]^T S_\epsilon^{-1} \left[ y - F(x_i) \right] \qquad (4.2)$$

and y and F(x_i) are respectively the observed and simulated radiances.

4.1.6 Convergence Test

The convergence test used to stop the iterative retrieval process was done, as suggested by C. Rodgers [17], in the atmospheric profile space (rate of change in the retrieved profile):

$$d_i^2 = (x_i - x_{i+1})^T\, \hat{S}^{-1}\, (x_i - x_{i+1}) \ll n \qquad (4.3)$$

where, close to convergence:

$$\hat{S} = \left[ S_a^{-1} + K_i^T S_\epsilon^{-1} K_i \right]^{-1} \qquad (4.4)$$

and n is the number of elements of the state vector.

Figure 4.1: Temperature retrieval error calculated according to Rodgers [17].

4.1.7 Retrieval Errors

The total retrieval error was estimated according to Rodgers [17]:

$$\hat{S} = \left[ K^T S_\epsilon^{-1} K + S_a^{-1} \right]^{-1} \qquad (4.5)$$

while its smoothing and measurement components were estimated according to the following equations:

$$S_s = \left[ K^T S_\epsilon^{-1} K + S_a^{-1} \right]^{-1} S_a^{-1} \left[ K^T S_\epsilon^{-1} K + S_a^{-1} \right]^{-1} \qquad (4.6)$$

$$S_m = \left[ K^T S_\epsilon^{-1} K + S_a^{-1} \right]^{-1} K^T S_\epsilon^{-1} K \left[ K^T S_\epsilon^{-1} K + S_a^{-1} \right]^{-1} \qquad (4.7)$$

and are shown in figures 4.1 and 4.2.

Figure 4.2: Water Vapor retrieval error calculated according to Rodgers [17].

4.2 Validation strategy

In order to guarantee the accuracy of the results, the retrievals obtained with UWPHYSRET were first tested using observed radiances (spectral validation). Only retrievals which passed the spectral test were considered successful and useful for environmental validation.

4.2.1 Spectral Validation

Retrievals on available IASI observations were validated spectrally through the retrieval residuals, calculated by subtracting the radiances simulated from the retrieved profiles, using the OSS forward model, from the observed radiances reconstructed after PCA compression. In the comparison the reconstructed radiances were preferred to the original observations because of the noise filtering properties of PCA [7] compression.
The residuals obtained were averaged in 5 spectral regions and were compared (in terms of Brightness Temperature) to the mean observation error (in BT) used in the retrieval process and described in sec. 4.1.3. The 5 spectral regions are indicated in tbl. 4.1 and were selected to guarantee that observations were properly fit in the 14 µm carbon dioxide band (accuracy of the vertical profile of temperature), in the 9.7 µm ozone band (accuracy of the ozone vertical distribution), in the 6.7 µm water vapor band (accuracy of the vertical distribution of water vapor), in the 12 µm window (accuracy of surface emissivity and surface temperature), and in the 11 µm band (accuracy of surface emissivity and surface temperature, presence of a high load of aerosols and/or presence of thin cirrus clouds). Only retrievals with average residuals ± one standard deviation smaller than the average observation error in all 5 bands were considered successful.

Region  Range in cm^-1
CO2     670 - 775
Win     775 - 990
O3      990 - 1070
Win 2   1090 - 1120
H2O     1240 - 2100
Total   670 - 2200

Table 4.1: Spectral regions used to validate retrieval residuals.

4.2.2 Environmental Validation

Spectral validation was needed to verify that the retrieved profiles indeed produced synthetic radiances whose distance from real observations was within the observation error. However, the spectral validation tests do not guarantee that the retrieved profiles are always accurate and/or realistic. For example, in the presence of thin cirrus clouds, the inversion system is often capable of retrieving atmospheric state variables associated with spectral residuals smaller than the observation noise; however, the effect of the cirrus cloud on the observed radiances, not modeled in the forward model calculation, is spread over the retrieved variables, causing errors of several degrees K in temperature and of several g/kg in water vapor mixing ratio.
For this reason before using the level 2 products to generate level 3 (instability indices) it is important to combine the spectral validation with an environmental validation. The remaining part of this document describes the procedure used to perform the environmental validation and the results achieved. 4.2.2.1 In situ observations used for environmental validation At each site, retrievals were compared with rawinsonde profiles, observed between July 2007 and December 2009, and obtained from CNMCA. Profiles were extrapolated up to 0.1 [hP a], and were quality controlled for saturation and/or missing values. Original rawinsonde observations at high vertical resolution (with single profile measurements made every 2 sec) were pressure averaged per layer according to the following equation: � � Xi +Xi−i N ∗ (Pi − Pi−1 ) � 2 (4.8) Xl = (Plow − Phigh ) i=1 where Xl is the atmospheric parameter to be averaged (T, WV) in the layer l, i is the i-th of the N sublevels that divide the layer l, and Plow and Phigh are the pressure extremes of the layer in consideration. Surface Temperature associated to the profile was randomly generated from lowest level temperature, constrained by surface type, time of day, and latitude. Retrievals labeled as successful by the spectral validation, were compared to available pressure averaged rawinsondes. The comparison was done to ensure that difference were within ranges expected because of time and space differences and not by other factors. 4.2.2.2 Statistical quantities used to characterize environmental validation Mean of the differences between the M rawinsondes variables, Y , and the M retrieved variables, X at each level i: �M j=1 Xi,j − Yi,j (4.9) δ̄i = M 63 Ref.: PA/IIS/FR/2010/01 and their standard deviation: δˆi = � � �� � � M (X − Y ) − (X − Y ) 2 � j=1 i,j i,j i,j i,j M (4.10) were calculated for each inverted spectrum. X and Y could be Temperature, and Water Vapor mixing ratio. Statistical quantities in eq. 
4.9 and 4.10 provided an indication of how retrieved profiles compare with observations made by rawinsondes generally launched within 3 hrs and 50 km of the satellite overpass.

4.3 Results

This section describes the results obtained by comparing the rawinsondes, pressure averaged according to the procedure outlined in section 4.2.2.1, with the retrieved profiles that passed the spectral validation test.

4.3.1 Udine Campoformido

Using the MAIA cloud mask, 2519 spectra were labeled as clear sky over an area of 1x1 degree centered on Udine Campoformido (lat: 46.03N, lon: 13.18E) for the time period July 2007 - December 2009. After inversion of the PCA noise filtered radiances, 702 retrievals passed the spectral test described in section 4.2.1. Of these spectrally successful retrievals, 236 (33.6%) were obtained from the morning orbit (AM), while 466 (66.4%) were associated with the afternoon overpasses (PM). Mean and standard deviation of the distance between the retrieved profiles and the closest rawinsonde launched after the satellite overpass (in any case within 200 min) are shown in figures 4.3, 4.4, 4.5, and 4.6. It is worth noting that:

1. the mean temperature distance at the surface for AM overpasses (blue line) was found to have the expected negative sign, since δ̄, defined in eq. 4.9, is the difference between the retrieval and the rawinsonde, and the retrieval was obtained, on average, 116 min before the sonde launch; its magnitude (∼ 1 K) is smaller than the PM one (∼ 3 K), figure 4.3;
2. large mean temperature deviations (∼ 2 K) were found around the tropopause, figure 4.3;
3. results for AM overpasses were characterized by a large mean deviation in water vapor mixing ratio (∼ 1.5 g/kg), figure 4.4;
4.
standard deviation of temperature was found to be large at the tropopause and at the surface, with AM overpasses having larger values, figure 4.5.

None of the mentioned points seemed to indicate pathological behavior of the retrieval.

Figure 4.3: Udine Campoformido: mean temperature distance between retrieved profiles and closest rawinsonde. Blue line represents the mean distance for morning (AM) overpasses. Red line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.

Figure 4.4: Udine Campoformido: mean water vapor mixing ratio distance between retrieved profiles and closest rawinsonde. Blue line represents the mean distance for AM overpasses. Red line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.

Figure 4.5: Udine Campoformido: standard deviation of temperature distance between retrieved profiles and closest rawinsonde. Blue line represents the mean distance for morning (AM) overpasses. Red line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.

Figure 4.6: Udine Campoformido: standard deviation of water vapor mixing ratio distance between retrieved profiles and closest rawinsonde. Blue line represents the mean distance for AM overpasses. Red line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.

Figure 4.7: Pratica di Mare: mean temperature distance between retrieved profiles and closest rawinsonde. Blue line represents the mean distance for morning (AM) overpasses. Red line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.
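The layer averaging of eq. 4.8 and the difference statistics of eq. 4.9 and 4.10, which underlie the curves in the figures of this section, can be sketched in Python as follows. This is only an illustrative sketch, not the code used in the study: the function names, the array layout (levels by number of co-located pairs), and the choice of taking the layer extremes from the first and last sublevel are assumptions.

```python
import numpy as np

def layer_average(x, p):
    """Pressure-weighted layer mean in the spirit of eq. 4.8.

    x, p: values and pressures at the sublevels of one layer, ordered
    monotonically in pressure. The layer extremes are assumed to be the
    first and last sublevel (an assumption made for this sketch).
    """
    x = np.asarray(x, dtype=float)
    p = np.asarray(p, dtype=float)
    mid = 0.5 * (x[1:] + x[:-1])   # (X_i + X_{i-1}) / 2
    dp = p[1:] - p[:-1]            # P_i - P_{i-1}
    return np.sum(mid * dp) / (p[-1] - p[0])

def difference_stats(retrieved, sonde):
    """Per-level mean (eq. 4.9) and standard deviation (eq. 4.10).

    Both inputs have shape (levels, M); differences are retrieval minus sonde.
    """
    d = np.asarray(retrieved, dtype=float) - np.asarray(sonde, dtype=float)
    mean = d.mean(axis=1)
    # Division by M (not M - 1), as written in eq. 4.10.
    std = np.sqrt(((d - mean[:, None]) ** 2).mean(axis=1))
    return mean, std

# Hypothetical numbers, for illustration only.
t_layer = layer_average([280.0, 281.0, 282.0], [900.0, 925.0, 950.0])
mean_d, std_d = difference_stats([[1.0, 3.0], [2.0, 2.0]],
                                 [[0.0, 0.0], [0.0, 0.0]])
```

With this sign convention a negative mean at a given level indicates that the retrieval is, on average, colder (or drier) than the co-located rawinsonde.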
4.3.2 Pratica di Mare

Using the MAIA cloud mask, 4489 spectra were labeled as clear sky over an area of 1x1 degree centered on Pratica di Mare (lat: 41.65N, lon: 12.43E) for the time period July 2007 - December 2009. After inversion, 1328 retrievals passed the spectral test described in section 4.2.1. Of these spectrally successful retrievals, 430 (32.4%) were obtained from the morning orbit (AM), while 898 (67.6%) were associated with the afternoon overpasses (PM). Mean and standard deviation of the distance between the retrieved profiles and the closest rawinsonde launched after the satellite overpass (in any case within 200 min) are shown in figures 4.7, 4.8, 4.9, and 4.10. Results over Pratica di Mare showed some inconsistencies:

1. the mean temperature distance at the surface for AM overpasses (blue line) was found to have an unexpected positive sign. As in the Udine case, δ̄, defined in eq. 4.9, represents the difference between the retrieval and the rawinsonde, with the retrieval obtained, on average, 116 min before the sonde launch. Positive values were expected for PM overpasses but not for the AM ones, as shown in figure 4.7;
2. AM overpass results were characterized by a sharp peak in the mean deviation in water vapor mixing ratio (∼ 2.5 g/kg) between 850 and 900 hPa, figure 4.8.

Both these results were unexpected and seemed to indicate problems with the retrievals.

Figure 4.8: Pratica di Mare: mean water vapor mixing ratio distance between retrieved profiles and closest rawinsonde. Blue line represents the mean distance for AM overpasses. Red line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.

Figure 4.9: Pratica di Mare: standard deviation of temperature distance between retrieved profiles and closest rawinsonde. Blue line represents the mean distance for morning (AM) overpasses.
Red line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.

Figure 4.10: Pratica di Mare: standard deviation of water vapor mixing ratio distance between retrieved profiles and closest rawinsonde. Blue line represents the mean distance for AM overpasses. Red line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.

4.3.3 Cagliari

Using the MAIA cloud mask, 6314 spectra were labeled as clear sky over an area of 1x1 degree centered on Cagliari (lat: 39.25N, lon: 9.05E) for the time period July 2007 - December 2009. After inversion, 1262 retrievals passed the spectral test described in section 4.2.1. Of these spectrally successful retrievals, 340 (26.9%) were obtained from the morning orbit (AM), while 922 (73.1%) were associated with the afternoon overpasses (PM). Mean and standard deviation of the distance between the retrieved profiles and the closest rawinsonde launched after the satellite overpass (in any case within 200 min) are shown in figures 4.11, 4.12, 4.13, and 4.14. Results over Cagliari showed the same inconsistencies that were found for Pratica di Mare:

1. the mean temperature distance at the surface for AM overpasses (blue line) has the unexpected positive sign. As in the Udine case, δ̄, defined in eq. 4.9, represents the difference between the retrieval and the rawinsonde, with the retrieval obtained, on average, 116 min before the sonde launch. Positive values were expected for PM overpasses but not for the AM ones, as shown in figure 4.11;
2. AM overpasses are characterized by a sharp peak in the mean deviation in water vapor mixing ratio (∼ 2.5 g/kg) between 850 and 900 hPa, figure 4.12.

Both these results were unexpected and seemed to indicate problems with the retrievals.
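The co-location rule used for all three sites (the closest rawinsonde launched after the satellite overpass, in any case within 200 min) can be sketched as below. The function name and the example times are hypothetical and chosen only for illustration; the actual matching code used in the study is not described in this report.

```python
from datetime import datetime, timedelta

def colocate(overpass, launches, max_delay_min=200):
    """Return the earliest launch at or after the overpass, or None when
    no launch falls within max_delay_min minutes of it."""
    window = timedelta(minutes=max_delay_min)
    later = [t for t in launches if timedelta(0) <= t - overpass <= window]
    return min(later) if later else None

# Hypothetical day with the two actual launch times quoted in this report
# (11:00 and 23:00 UTC) and an illustrative morning overpass.
launches = [datetime(2008, 7, 1, 11, 0), datetime(2008, 7, 1, 23, 0)]
am_match = colocate(datetime(2008, 7, 1, 9, 4), launches)   # 116 min before launch
no_match = colocate(datetime(2008, 7, 1, 6, 0), launches)   # 300 min: rejected
```

Because the retrieval always precedes the sonde in this scheme, part of the retrieval minus rawinsonde difference near the surface reflects the diurnal cycle rather than retrieval error, which is why the sign of δ̄ is examined separately for AM and PM overpasses.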
Figure 4.11: Cagliari: mean temperature distance between retrieved profiles and closest rawinsonde. Blue line represents the mean distance for morning (AM) overpasses. Red line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.

Figure 4.12: Cagliari: mean water vapor mixing ratio distance between retrieved profiles and closest rawinsonde. Blue line represents the mean distance for AM overpasses. Red line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.

Figure 4.13: Cagliari: standard deviation of temperature distance between retrieved profiles and closest rawinsonde. Blue line represents the mean distance for morning (AM) overpasses. Red line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.

Figure 4.14: Cagliari: standard deviation of water vapor mixing ratio distance between retrieved profiles and closest rawinsonde. Blue line represents the mean distance for AM overpasses. Red line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.

Figure 4.15: Pratica di Mare: mean temperature distance between retrieved profiles and closest rawinsonde over water. Blue line represents the mean distance for morning (AM) overpasses. Red line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.

4.4 Analysis of Results

To further investigate the inconsistencies in the results obtained over Pratica di Mare and Cagliari (both coastal regions), the means of the distances for temperature and water vapor were calculated separately for Fields of View (FOVs) with more and with less than 50% of water surface.
Figures 4.15 and 4.16 show δ̄_T and δ̄_MR over water for Pratica di Mare, while figures 4.17 and 4.18 show the same quantities over land. Positive values at the surface for the AM overpasses were found in both cases, but the effect was found to be much larger over land (with discrepancies of 5 - 6 K and 4 g/kg) than over water (with discrepancies of 1 K and 2.5 g/kg). To verify that the positive values at the surface are not due to climatological effects, all the rawinsondes available over Pratica di Mare at 06:00 and 18:00 UTC were compared to the rawinsondes launched 6 hrs (360 min) later. The comparison was performed for the whole period between July 2007 and December 2009. The climatological signal for AM overpasses was found to have large (5 K) differences in temperature close to the surface (figure 4.19) and smaller (2 K) differences for PM overpasses. Water vapor mixing ratio differences did not show any significant signal between 850 and 900 hPa, and differences in proximity of the surface were larger for PM overpasses than for AM ones (figure 4.20).

Figure 4.16: Pratica di Mare: mean water vapor mixing ratio distance between retrieved profiles and closest rawinsonde over water. Blue line represents the mean distance for AM overpasses. Red line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.

Figure 4.17: Pratica di Mare: mean temperature distance between retrieved profiles and closest rawinsonde over land. Blue line represents the mean distance for morning (AM) overpasses. Red line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.

Figure 4.18: Pratica di Mare: mean water vapor mixing ratio distance between retrieved profiles and closest rawinsonde over land. Blue line represents the mean distance for AM overpasses. Red line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.

Figure 4.19: Pratica di Mare: mean temperature distance between consecutive rawinsondes. Blue and red lines represent the mean distance for AM and PM overpasses respectively.

Figure 4.20: Pratica di Mare: mean water vapor mixing ratio distance between consecutive rawinsondes. Blue and red lines represent the mean distance for AM and PM overpasses respectively.

By comparing the climatological rawinsonde signal in figure 4.19 to the values of δ̄_T and δ̄_MR obtained over Pratica di Mare and shown in figure 4.7, it was found that PM overpasses were in good agreement with the climatological rawinsonde signal, while AM overpasses showed large discrepancies. It is worth emphasizing that:

• the rawinsonde signal is associated with a 6 hr time difference, 3 times larger than the time difference between satellite overpass and rawinsonde launch; therefore the magnitude of the climatological signal is expected to be larger than the retrieval-rawinsonde differences;
• the climatological signal extracted from the rawinsondes is associated with land only differences, while the retrieval-rawinsonde differences were calculated over both land and ocean, with the ocean thermal capacity being quite different from the land one.

Given that the PM differences were found to be realistic and close to what could be expected from climatology, it was unlikely that the suspicious AM differences were caused by wrong surface emissivity or by pathologies of the a-priori information (which would also affect the PM cases). One possible explanation was identified in the higher probability of cloud contamination (especially from thin cirrus clouds) for the AM overpasses and over land.
To validate this hypothesis, the retrieved profiles were screened for potential saturation simply by retaining all the profiles that had relative humidity (calculated over water) smaller than altitude dependent threshold values (empirically chosen to prove the concept). Figure 4.22 shows how neglecting the retrievals with potential saturation causes a shift in temperature towards climatological (negative) values close to the surface. The same effect was also evident in the histograms of the temperature differences calculated over the whole set and over the RH screened set, at 986 and 1014 hPa, shown in figure 4.23. While the number of “good” retrievals decreased by about 13, and while all the AM retrievals over land were labeled as contaminated by clouds, the distribution of the temperature differences near the surface was found to move correctly towards negative values.

Figure 4.21: Pratica di Mare: mean temperature distance between retrievals and rawinsondes: blue and red solid lines represent the mean distance for AM and PM overpasses respectively. Mean temperature distances between consecutive rawinsondes are shown by blue and red dashed lines for AM and PM overpasses respectively.

However, while the RH based correction takes care of the issues near the surface, the problems between 800 and 900 hPa were found to persist both in temperature (figure 4.22) and in water vapor mixing ratio (figure 4.24). It is possible that better thresholds have to be determined; however, the problem might also be related to instability of the retrievals due to improper use of the a-priori covariance matrix. UWPHYSRET, in its current form, retrieves all the variables, and the high level of correlation between different levels could be a source of instability when calculating the inverse of the a-priori matrix (used in the iterative equation 4.1).
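The conditioning concern can be illustrated with a toy example. The sketch below does not use the UWPHYSRET a-priori matrix, which is not reproduced in this report: it assumes a hypothetical unit-variance covariance with exponentially decaying inter-level correlation, and shows that stronger correlation between levels makes the matrix much harder to invert, while an inverse built from a few leading principal components avoids the small eigenvalues altogether.

```python
import numpy as np

def exp_covariance(n_levels, corr_length):
    """Toy covariance: unit variance, correlation exp(-|i - j| / corr_length).
    This functional form is an assumption made only for illustration."""
    idx = np.arange(n_levels)
    return np.exp(-np.abs(idx[:, None] - idx[None, :]) / corr_length)

def truncated_inverse(cov, n_keep):
    """Pseudo-inverse built from the n_keep leading principal components."""
    vals, vecs = np.linalg.eigh(cov)              # eigenvalues in ascending order
    vals, vecs = vals[-n_keep:], vecs[:, -n_keep:]
    return (vecs / vals) @ vecs.T                 # V diag(1/lambda) V^T

weak = exp_covariance(40, 1.0)     # weakly correlated levels
strong = exp_covariance(40, 20.0)  # strongly correlated levels

cond_weak = np.linalg.cond(weak)
cond_strong = np.linalg.cond(strong)   # much larger than cond_weak

# Inverse restricted to a compressed, 5-component state space.
compressed_inv = truncated_inverse(strong, 5)
```

In a real retrieval the number of retained components would have to be chosen from the explained variance of the a-priori profiles, which this sketch does not attempt.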
For this reason, Principal Component Analysis should be applied to the profiles before inversion, and the equation should be applied to a lower dimensional (compressed) state vector. The RH correction had a similar impact on the data retrieved over Cagliari, while it had a milder impact on the data retrieved over Udine, indicating that AM cloudiness might be more of a factor over land in coastal areas than in more inland cases (like Udine Campoformido).

4.5 Conclusions

Retrievals obtained for the three areas under investigation were validated spectrally to ensure proper functioning of the inversion package used (UWPHYSRET). Spectral validation was performed by comparing the retrieval residuals to the observation error used in the inversion process. Only retrievals whose spectral residuals were found to be consistently smaller than the observation error were considered successful.

Figure 4.22: Pratica di Mare: mean temperature distance between retrievals and rawinsondes: blue diamonds and solid lines represent the mean distance for AM overpasses for retrievals before (left) and after (right) RH correction. Mean temperature distance between consecutive rawinsondes is shown in cyan circles.

Figure 4.23: Pratica di Mare: histogram of the mean temperature distance between retrievals and rawinsondes at 986 (blue) and 1014 hPa (red) before (left) and after (right) RH correction.

Figure 4.24: Pratica di Mare: mean temperature distance between retrievals and rawinsondes: blue diamonds and solid lines represent the mean distance for AM overpasses for retrievals before (left) and after (right) RH correction. Mean temperature distance between consecutive rawinsondes is shown in cyan circles.

Spectral validation was followed by an environmental validation, in which the retrieved temperature and water vapor profiles were compared to available rawinsondes launched in the areas of interest within 200 minutes of the satellite overpass.
While the differences between retrievals and rawinsondes obtained over Udine seemed to be within expected values, anomalies were found for Pratica di Mare and Cagliari, especially for morning overpasses. A detailed analysis of the issue indicated that a significant part of the AM retrievals might have been contaminated by clouds, especially over land. Removing the potentially cloud contaminated retrievals improved the quality of the results near the surface, but did not have an impact on the anomalies found in both temperature and water vapor mean differences between 800 and 900 hPa. This issue should be further investigated. The final outcome of the validation study is that the retrieved profiles could be used to generate instability indices (level 3 products); however, it is strongly recommended that all the profiles available over the 1x1 degree areas for a given overpass be averaged before generating level 3 data, to minimize the possible side effects introduced by clouds.

Chapter 5
Technical Report 2: Instability Indices (Level 3 Products) derived from IASI retrievals (Level 2 Products)

Document: Technical Report 2
Written by: Paolo Antonelli and Silvia Puca
Date: 25 August 2010
Reference: PA/IIS/TR02/2010/02

5.1 Introduction

This document is the second report of activities for the project on atmospheric instability derived from IASI observations. The document follows up the first report (Reference: PA/IIS/2010/01) and describes preliminary results obtained after the optimization of UWPHYSRET. For this study about 1500 IASI observations were collected in a 1 by 1 degree box centered on Pratica di Mare, Italy, for the time period July – September 2007. The clear sky observations were inverted, and instability indices were derived from the retrievals and compared to those derived from available rawinsondes launched in Pratica di Mare at 12:00 and 00:00 UTC. Good agreement was found between instability derived from satellite and from rawinsondes.
Also good agreement was found between electric activity (lightning) and instability derived from the rawinsondes. Future investigation will make use of a convection detection system based on satellite (SEVIRI) data.

5.2 Observations

For this study 1565 IASI L1C observations were collected in an area of 1x1 degree around Pratica di Mare, Italy (lat: 41.65N, lon: 12.43E) for the time period July – September 2007. Before inversion, data were thinned using the MAIA cloud mask: 757 observations were labeled as clear sky (corresponding to FOVs more than 98% clear) and 469 led to convergence in the retrieval process. However, part of these observations were found to be contaminated by clouds not detected by MAIA, and after the spectral and environmental validation only 250 retrieved profiles (shown in figure 5.1, and hereafter indicated with M) were considered usable to derive Instability Indices. Along with the IASI observations, a total of 154 rawinsondes launched at 12:00 (11:00 UTC actual launch) and 00:00 UTC (23:00 UTC actual launch) were also collected. The location of the rawinsonde site is shown in blue in figure 5.1. Rawinsonde data were made available by CNMCA of the Italian Air Force. CNMCA also provided for this study an estimate of the lightning activity during the time period of the study. Lightning data were used, at this stage, without adequate quality control, and some problems were reported for July and September 2007. For this reason, further investigation of the lightning observations will be required.

Figure 5.1: Spatial distribution of retrievals (red diamonds) used for assessment of PCA impact on level 2 product accuracy over Pratica di Mare. Blue circle indicates location of rawinsonde launches.

Figure 5.2: Mean Temperature distance (black line) and Standard Deviation of Temperature distance between rawinsondes and retrievals.
5.3 Retrievals

Retrievals were performed with UWPHYSRET, a physical retrieval package developed at the Space Science and Engineering Center of the University of Wisconsin – Madison. Out of the 757 clear sky observations, 250 led to successful retrievals (locations shown in red in figure 5.1). Retrievals were considered successful when spectral residuals were within the estimated observation error and no saturation was found in the retrieved profiles. Statistics of the Temperature and Water Vapor Mixing Ratio distances between retrievals and time co-located rawinsondes (Retrieval – Rawinsonde) are shown in figures 5.2 and 5.3. Considering that the mean time difference between the IASI retrieval and the effective launch time of the rawinsonde is about 160 minutes, and that the space distance between IASI observations and the rawinsonde launch site ranges from 0 to 50 km, the quality of the retrievals was considered satisfactory. A temperature deviation of 1.5 K in the boundary layer was expected because 65% of the successful retrievals were associated with the evening overpasses (around 19:30 UTC) and were therefore compared to colder (in the boundary layer) rawinsonde profiles. Large discrepancies between 100 and 200 hPa have to be further investigated but do not impact the study of lower tropospheric instability.

5.4 Instability Indices

Instability Indices were derived from rawinsondes and retrieved profiles with Sound_Analys, a Python based software package developed by A. Manzato at the Osservatorio Meteorologico Regionale (OSMER) of the Agenzia Regionale per la Protezione dell’Ambiente del Friuli Venezia Giulia (ARPA-FVG). Time series of CAPE and Lifted Index are shown for the months of July (figures 5.4 and 5.7), August (figures 5.5 and 5.8), and September (figures 5.6 and 5.9) 2007.
Red circles indicate the values of CAPE (LI) derived from 00:00 and 12:00 UTC rawinsondes; blue diamonds show the mean values of CAPE for all the retrievals co-located with a given rawinsonde; blue error bars are associated with the variability of the IASI derived CAPE among all the retrievals co-located with a given rawinsonde; black squares show the number of lightning events observed within 10 hours of the rawinsonde launch in a 1x1 degree box centered over Pratica di Mare. In figures 5.7, 5.8, and 5.9 the black squares indicate the log of the number of lightning events. The agreement between the values of CAPE and LI derived from rawinsondes and from retrievals is encouraging at this stage. More conclusive results will be available only after processing data for the summers of 2008, 2009 and 2010. General agreement is also found between rawinsonde derived CAPE and LI and the detection of electric activity. More conclusive results will be achieved with the use of NEFODINA, a convection detection scheme developed at CNMCA.

Figure 5.3: Mean Water Vapor Mixing Ratio distance (black line) and Standard Deviation of Water Vapor Mixing Ratio distance between rawinsondes and retrievals.

5.5 Conclusions

Results presented in this document represent the outcome of a preliminary study on instability derived from satellite (IASI) data. About 1500 IASI observations were collected in a 1 by 1 degree box centered on Pratica di Mare, Italy, for the time period July – September 2007. After applying the MAIA cloud detection scheme, the clear sky observations were inverted with UWPHYSRET, a physical inversion scheme developed at the University of Wisconsin-Madison, and instability indices (CAPE and LI) were derived from the retrievals with Sound_Analys, a software package developed at OSMER. Instability indices derived from IASI retrievals were compared to those derived from available rawinsondes launched in Pratica di Mare at 12:00 and 00:00 UTC.
Good agreement was found between instability derived from satellite and from rawinsondes. Also good agreement was found between electric activity (lightning) and instability derived from the rawinsondes. Future investigation will make use of a convection detection system based on satellite (SEVIRI) data.

Figure 5.4: Time Series of CAPE (J/kg) for the month of July 2007. Red circles indicate the values of CAPE derived from 00:00 and 12:00 UTC rawinsondes; blue diamonds show the mean values of CAPE for all the retrievals co-located with a given rawinsonde; blue error bars are associated with the variability of the IASI derived CAPE among all the retrievals co-located with a given rawinsonde; black squares show the number of lightning events observed within 10 hours of the rawinsonde launch in a 1x1 degree box centered over Pratica di Mare.

Figure 5.5: Time Series of CAPE (J/kg) for the month of August 2007. Red circles indicate the values of CAPE derived from 00:00 and 12:00 UTC rawinsondes; blue diamonds show the mean values of CAPE for all the retrievals co-located with a given rawinsonde; blue error bars are associated with the variability of the IASI derived CAPE among all the retrievals co-located with a given rawinsonde; black squares show the number of lightning events observed within 10 hours of the rawinsonde launch in a 1x1 degree box centered over Pratica di Mare.

Figure 5.6: Time Series of CAPE (J/kg) for the month of September 2007. Red circles indicate the values of CAPE derived from 00:00 and 12:00 UTC rawinsondes; blue diamonds show the mean values of CAPE for all the retrievals co-located with a given rawinsonde; blue error bars are associated with the variability of the IASI derived CAPE among all the retrievals co-located with a given rawinsonde; black squares show the number of lightning events observed within 10 hours of the rawinsonde launch in a 1x1 degree box centered over Pratica di Mare.
Figure 5.7: Time Series of Lifted Index for the month of July 2007. Red circles indicate the values of LI derived from 00:00 and 12:00 UTC rawinsondes; blue diamonds show the mean values of LI for all the retrievals co-located with a given rawinsonde; blue error bars are associated with the variability of the IASI derived LI among all the retrievals co-located with a given rawinsonde; black squares show the log of the number of lightning events observed within 10 hours of the rawinsonde launch in a 1x1 degree box centered over Pratica di Mare.

Figure 5.8: Time Series of Lifted Index for the month of August 2007. Red circles indicate the values of LI derived from 00:00 and 12:00 UTC rawinsondes; blue diamonds show the mean values of LI for all the retrievals co-located with a given rawinsonde; blue error bars are associated with the variability of the IASI derived LI among all the retrievals co-located with a given rawinsonde; black squares show the log of the number of lightning events observed within 10 hours of the rawinsonde launch in a 1x1 degree box centered over Pratica di Mare.

Figure 5.9: Time Series of Lifted Index for the month of September 2007. Red circles indicate the values of LI derived from 00:00 and 12:00 UTC rawinsondes; blue diamonds show the mean values of LI for all the retrievals co-located with a given rawinsonde; blue error bars are associated with the variability of the IASI derived LI among all the retrievals co-located with a given rawinsonde; black squares show the log of the number of lightning events observed within 10 hours of the rawinsonde launch in a 1x1 degree box centered over Pratica di Mare.
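For reference, the two indices plotted in figures 5.4 to 5.9 are conventionally defined as shown below. These are the common textbook forms, given here as an assumption: the report does not state which parcel definition or which exact formulation Sound_Analys uses. The symbols T_{v,p} and T_{v,e} (virtual temperature of the lifted parcel and of the environment), z_{LFC} and z_{EL} (level of free convection and equilibrium level), and T_p and T_e (parcel and environment temperature at 500 hPa) are introduced only for this illustration.

```latex
\[
\mathrm{CAPE} = \int_{z_{LFC}}^{z_{EL}} g \, \frac{T_{v,p}(z) - T_{v,e}(z)}{T_{v,e}(z)} \, dz ,
\qquad
\mathrm{LI} = T_{e}(500\ \mathrm{hPa}) - T_{p}(500\ \mathrm{hPa})
\]
```

With these conventions, larger CAPE (in J/kg) and more negative LI (in K) both indicate a more unstable profile, so the two time series are expected to vary in opposite directions.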
Chapter 6
Technical Report 1: Dataset description

Document: Technical Report 1
Written by: Paolo Antonelli
Date: 16 October 2010
Reference: PA/IIS/TR01/2010/01

6.1 Introduction

This document describes the data collected for the project Evaluating atmospheric instability from high spectral resolution IR satellite observations, supported under contract EUM/CO/10/4600000746/SAT. The dataset used and delivered to EUMETSAT consists of observations over three areas of 1x1 degree centered on Pratica di Mare, Udine, and Cagliari, as shown in figure 6.1. Observations include: 1) IASI data; 2) lightning data; 3) rawinsonde data for 00:00 and 12:00 UTC, launched by CNMCA from the air force bases located in the three areas. Data collected and delivered cover the time period from 01 July 2007 to 30 September 2009. At the time of delivery only a subset of NEFODINA is available, and it is not included in the delivery.

6.2 IASI data

For the study 13322 IASI L1C observations were collected in the areas of 1x1 degree around Pratica di Mare, Udine Campoformido, and Cagliari, Italy, for the time period July 2007 – September 2009. Before inversion, IASI data were thinned using the MAIA cloud mask. After the spectral and environmental validation only a fraction of the retrieved profiles (shown in figures 6.2, 6.5, and 6.8) were considered usable to derive instability indices. The number of available retrievals for each satellite overpass ranges from 0 to about 10. IASI spectra were PCA noise filtered for spectral validation purposes. Both original and PCA noise filtered observations are contained in the netcdf files XXX_IASI_ORIGINAL_RAD_SUCC_RET.nc and XXX_IASI_FILTERED_RAD_SUCC_RET.nc (where XXX can be PDM, UDI, CAG). An example of the file structure as generated by ncdump can be found in Annex 1.

Figure 6.1: Areas of Interest
6.2.1 Pratica di Mare (lat: 41.65N, lon: 12.43E)

A total of 1328 spectra passed the spectral validation (described in PA/IIS/TR03/2010/01) over Pratica di Mare: 430 (corresponding to 32.4%) were associated with morning overpasses, while 898 (corresponding to 67.6%) were collected in the evening. Figure 6.3 shows the mean (blue curve) and the mean±std (red curves) of the observed radiances. Figure 6.4 shows the distribution, for the whole dataset, of: 1) the Field Of View (FOV) angle; 2) the estimated FOV water percentage; 3) the estimated FOV surface elevation; 4) the FOV IGBP Land Cover Classification; 5) the array detector index.

6.2.2 Udine Campoformido (lat: 46.03N, lon: 13.18E)

After the spectral validation only 702 retrieved profiles (shown in figure 6.5) were considered usable to derive instability indices. Out of the 702 spectra, 236 (corresponding to 33.6%) were collected in the morning overpass, while 466 (corresponding to 66.4%) were observed in the evening. Figure 6.6 shows the mean (blue curve) and the mean±std (red curves) of the observed radiances. Figure 6.7 shows the distribution of: 1) the Field Of View (FOV) angle; 2) the estimated FOV water percentage; 3) the estimated FOV surface elevation; 4) the FOV IGBP class; 5) the array detector index. Both original and PCA noise filtered observations are contained in the netcdf files UDI_IASI_ORIG_RAD_SUCC_RET.nc and UDI_IASI_FILT_RAD_SUCC_RET.nc.

Figure 6.2: Spatial distribution of IASI observations (red dots) collected over Pratica di Mare.

Figure 6.3: Mean and Standard Deviation of IASI observations collected over Pratica di Mare.

Figure 6.4: Statistics of IASI observations collected over Pratica di Mare.

Figure 6.5: Spatial distribution of IASI observations (red dots) collected over Udine Campoformido.

Figure 6.6: Mean and Standard Deviation of IASI observations collected over Udine Campoformido.
Figure 6.7: Statistics of IASI observations collected over Udine Campoformido.

6.2.3 Cagliari (lat: 39.25N, lon: 9.05E)

After the spectral validation only 1262 retrieved profiles (shown in figure 6.8) were considered usable to derive instability indices. IASI spectra were PCA noise filtered for spectral validation purposes. Both original and PCA noise filtered observations are contained in the netcdf files CAG_IASI_ORIG_RAD_SUCC_RET.nc and CAG_IASI_FILT_RAD_SUCC_RET.nc. Out of the 1262 spectra, 340 (corresponding to 26.9%) were collected in the morning overpass, while 922 (corresponding to 73.1%) were observed in the evening. Figure 6.9 shows the mean (blue curve) and the mean±std (red curves) of the observed radiances. Figure 6.10 shows the distribution of: 1) the Field Of View (FOV) angle; 2) the estimated FOV water percentage; 3) the estimated FOV surface elevation; 4) the FOV IGBP class; 5) the array detector index.

Figure 6.8: Spatial distribution of IASI observations (red dots) collected over Cagliari.

Figure 6.9: Mean and Standard Deviation of IASI observations collected over Cagliari.

Figure 6.10: Statistics of IASI observations collected over Cagliari.

6.3 Lightning

Lightning data were provided by CNMCA in ASCII format, for areas of 1x1 degree centered on Pratica di Mare, Udine, and Cagliari. Observations were made by LAMPINET, the lightning network of Servizio Meteorologico Aeronautica Militare. The observed variables are the Magnetic Direction Finding (MDF) and the Time Of Arrival (TOA). Geolocation was performed through both TOA and MDF. Network coverage is shown in figure 6.11. CNMCA personnel reported that some of the sensors were not working properly in the summer of 2007, in particular in July and August 2007; the number of events for this time period is therefore considered underestimated.
Netcdf files with lightning data contain: 1) the number of occurrences over the 1x1 degree areas within the 10 hr following the rawinsonde launch; 2) the time of the rawinsonde launch; 3) the mean latitude and longitude of the lightning events observed in the 10 hr following the rawinsonde launch; 4) the standard deviation of the latitude and longitude of the lightning events observed in the 10 hr following the rawinsonde launch. Delivered files are XXX_LGT_DATA.nc (where XXX can be PDM, UDI, CAG). The file structure of PDM_LGT_DATA.nc, as generated by ncdump, can be found in Annex 2.

Figure 6.11: LAMPINET network coverage with isolines of estimated geolocation error. Courtesy of CNMCA.

6.3.1 Pratica di Mare

Figure 6.12 shows the time series of the lightning occurrences within the 10 hr following the rawinsonde launch over Pratica di Mare.

Figure 6.12: Pratica di Mare: Time series of lightning occurrences in the 10 hr following the rawinsonde launch.

6.3.2 Udine, Campoformido

Figure 6.13 shows the time series of the lightning occurrences within the 10 hr following the rawinsonde launch over Udine, Campoformido.

Figure 6.13: Udine: Time series of lightning occurrences in the 10 hr following the rawinsonde launch.

6.3.3 Cagliari

Figure 6.14 shows the time series of the lightning occurrences within the 10 hr following the rawinsonde launch over Cagliari.

Figure 6.14: Cagliari: Time series of lightning occurrences in the 10 hr following the rawinsonde launch.

6.4 Rawinsondes

Original data provided by CNMCA were obtained with VAISALA RS-92 sondes. Since July 2005 the Italian Meteorological Service has discontinued the use of the VAISALA RS-90 rawinsonde and introduced the RS-92 sonde. Technical specifications of the rawinsonde are shown in the original VAISALA data sheet in figure 6.15. Observations were collected at Pratica di Mare, Udine Campoformido, and Cagliari.
Figure 6.15: VAISALA RS-92 specs from http://www.vaisala.com/

6.4.1 Pressure interpolated profiles

For this study the rawinsonde profiles were extrapolated up to 0.1 hPa, and were quality controlled for saturation and/or missing values. Original rawinsonde observations at high vertical resolution (with single profile measurements made every 2 sec) were pressure averaged per layer according to the following equation:

$$X_l = \frac{\sum_{i=1}^{N} \frac{X_i + X_{i-1}}{2}\,(P_i - P_{i-1})}{P_{low} - P_{high}} \tag{6.1}$$

where $X_l$ is the layer average of the atmospheric parameter $X$ (T, WV) in the layer $l$, $i$ indexes the $N$ sublevels that divide the layer $l$, and $P_{low}$ and $P_{high}$ are the pressure extremes of the layer under consideration. The surface temperature associated with each profile was randomly generated from the lowest-level temperature, constrained by surface type, time of day, and latitude.

Observations collected at Pratica di Mare, Udine Campoformido, and Cagliari are stored in netcdf files (XXX_RAWINSONDE_101L_PROF.nc where XXX can be PDM, UDI, CAG). The file structure as generated by ncdump can be found in Annex 4.

6.4.2 Pratica di Mare

The 1757 rawinsondes collected at Pratica di Mare airport are stored in the PDM_RAWINSONDE_101L_PROF.n file. Examples of the statistical properties of the profiles for Pratica di Mare are shown in figures 6.16 (Mean Temperature), 6.17 (Standard Deviation of Temperature), 6.18 (Mean Water Vapor Mixing Ratio), and 6.19 (Standard Deviation of Water Vapor Mixing Ratio). The figures show profiles for the different launch times (06:00 UTC in blue, 12:00 UTC in red, 18:00 UTC in magenta, 00:00 UTC in green) and for the whole day (black dashed line).
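The layer averaging of equation 6.1 (section 6.4.1) can be illustrated with a short Python sketch. It is not the code used in the study: the function name layer_average, the argument layout, and the sample numbers are hypothetical, and it assumes the sublevels are ordered so that the first one sits at P_high and the last one at P_low.

```python
def layer_average(pressures, values):
    """Pressure-weighted trapezoidal mean of `values` over one layer (eq. 6.1).

    Assumption (not stated in the report): `pressures` holds the N+1 sublevel
    pressures of the layer, ordered from P_high (index 0) to P_low (index N),
    and `values` holds the parameter X (e.g. T or WV) at the same sublevels.
    """
    if len(pressures) != len(values) or len(pressures) < 2:
        raise ValueError("need at least two sublevels of equal length")
    p_high, p_low = pressures[0], pressures[-1]
    if p_low == p_high:
        raise ValueError("layer has zero pressure thickness")

    total = 0.0
    for i in range(1, len(pressures)):
        # Mean of X on the sublayer times the sublayer pressure thickness
        total += 0.5 * (values[i] + values[i - 1]) * (pressures[i] - pressures[i - 1])
    return total / (p_low - p_high)


# Illustrative numbers only: temperature [K] at three sublevels [hPa]
print(layer_average([900.0, 950.0, 1000.0], [280.0, 284.0, 288.0]))
```

Because each sublayer is weighted by its pressure thickness, unevenly spaced 2-second samples do not bias the layer value toward the densely sampled part of the layer.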
6.4.3 Udine, Campoformido

The 2069 rawinsondes collected at Udine Campoformido are stored in the UDI_RAWINSONDE_101L_PROF.nc file.

6.4.4 Cagliari

The 1858 rawinsondes collected at Cagliari Elmas airport are stored in the CAG_RAWINSONDE_101L_PROF.nc file.

Figure 6.16: Pratica di Mare: Mean of Temperature profiles interpolated on 101 levels.

Figure 6.17: Pratica di Mare: Standard deviation of Temperature profiles interpolated on 101 levels.

Figure 6.18: Pratica di Mare: Mean and Standard deviation of Water Vapor Mixing Ratio profiles interpolated on 101 levels.

Figure 6.19: Pratica di Mare: Standard deviation of Water Vapor Mixing Ratio profiles interpolated on 101 levels.

Chapter 7
Conclusions

This document describes the results obtained by two forecast systems for thunderstorms (events with more than 10 lightning strikes between 11:00 and 17:00 UTC, for the time period April to October) over the Po Valley, and all the work performed to implement the two systems. The first system, based on an artificial neural network (ANN) trained using instability indices derived from rawinsondes launched at Milano Linate and Udine Campoformido between 2004 and 2010, led to excellent results, with a prediction PSS of 0.68. The second system was designed to replicate the first one, but with predictors derived only from IASI level 3 products (instability indices) and level 1 data (radiances). The capacity of the IASI-trained ANN to predict convection was found to be poor, with a final PSS of 0.21. The poor generalization of the prediction of convective events from IASI data and products was found to depend mostly on the limited size of the IASI database (available retrievals in clear sky conditions), which is a factor of 10 smaller than the rawinsonde database (both for training and testing).
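As a reading aid for the scores quoted in this chapter, the sketch below computes a PSS from a 2x2 contingency table. It assumes that PSS denotes the Peirce skill score, defined as probability of detection minus probability of false detection. The function name and the counts in the example are illustrative only and are not taken from the experiments.

```python
def peirce_skill_score(hits, false_alarms, misses, correct_negatives):
    """Peirce skill score (assumed meaning of PSS): POD minus POFD.

    hits:              event forecast and observed
    false_alarms:      event forecast but not observed
    misses:            event observed but not forecast
    correct_negatives: event neither forecast nor observed
    """
    observed_events = hits + misses
    observed_non_events = false_alarms + correct_negatives
    if observed_events == 0 or observed_non_events == 0:
        raise ValueError("PSS is undefined without both events and non-events")
    pod = hits / observed_events               # probability of detection
    pofd = false_alarms / observed_non_events  # probability of false detection
    return pod - pofd


# Hypothetical counts, for illustration only
print(round(peirce_skill_score(40, 10, 10, 40), 2))
```

A perfect forecast scores 1 and a forecast with no skill scores 0. This is why the score can be compared between the rawinsonde and IASI systems even though their test sets differ in size.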
However, a general tendency of the retrievals to overestimate low level water vapor, which led to overestimation of the atmospheric instability, was found and should be further investigated. Finally, by focusing on a single area of interest, over Milano, Italy, we were able to increase the size of the IASI database by a factor of two, and the prediction PSS on the test set reached the value of 0.49, indicating that nowcasting of convection by IASI data over individual (smaller) areas, and therefore with larger datasets, improves considerably. In other words, the experiment over Milano demonstrated the feasibility and potential of the satellite based nowcasting system. Besides the final scores, the significance of the presented material lies in the correlation found between some of the IASI radiances and the occurrence of convection, and in the validation of the IASI level 2 and level 3 products.

Authors

Paolo Antonelli is a researcher who works at 60% for SSEC of the University of Wisconsin - Madison. His expertise is in the area of high spectral resolution data inversion and data compression.

Agostino Manzato is a scientist of OSMER ARPA FVG. His expertise ranges from meteorology to statistical learning.

Silvia Puca is a scientist of the Italian Department of Civil Protection. Her expertise is in the areas of convection and statistical learning.

Lt. Col. Francesco Zauli is the head of the satellite group at CNMCA. His expertise is in the area of satellite meteorology.

Acknowledgments

The authors wish to thank Dr. R. Stuhlmann and Dr. S. Tjemkes of EUMETSAT for their constructive comments and their continuous support; Cap. A. Vocino of CNMCA for reviewing the documents and managing the program; Mr. R. Garcia of SSEC for his invaluable help in gathering IASI data; Cap. Davide Melfi and Daniele Biron of CNMCA for their kind help in gathering lightning and auxiliary data; many thanks to Dr. A.
Van Delden for reviewing the core part of this document and for taking part in the final presentation. Activities described in this report were supported by EUMETSAT, through grant EUM/CO/10/4600000746/SAT, and SSEC, through grants NOAA NA06NES4400002 and NASA NNX07AK89G, for the use of IASI data in predicting instability and for the main development of UWPHYS RET respectively. Finally, many thanks go to OSMER ARPA FVG for making available precious human resources and expertise to the project.