The Application of COSMIC Data to Global Change Research

The Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC) is well-suited to climate research, especially as it pertains to climate modeling. It presents a challenge to climate models, which are currently tuned to match climate mean states, by providing precisely calibrated data which can be analyzed according to two methods that are insensitive to standard model tuning. Those two methods are climate signal detection and second-moment statistics, both of which consider the most useful climate model to be the one which provides the best predictions rather than the one which best recreates the current climate. In this paper we discuss these two new, alternative approaches to improving climate models and how COSMIC occultation data can be analyzed in this context. Climate signal detection is usually applied to determine what trends in a climate data set can be described by external effects, such as increasing greenhouse gas concentrations, sulfur dioxide aerosols, etc. Here we show that it is actually a method to test climate models. By examining climate trends and anomalies as revealed by COSMIC data, we can test whether climate models reproduce those trends and anomalies. We describe in detail how trends and anomalies can be extracted from COSMIC occultation data and the relevance it should have to climate models. The fluctuation dissipation theorem, as applied to the climate, shows that a second-moment analysis of a climate model's output will reveal more about its physical soundness than do the mean states it produces. While this theorem shows how a Green's function for climate change can be derived from the second moments of the climate system, it is best applied by comparing like second moments in data and in model output. This method of testing models is most likely to reveal which parameterizations of convection and moisture dispersion are most appropriate.
The two methods of improving climate models are discussed in the context of COSMIC, showing how the occultation data can be processed to apply each of them.

1 Jet Propulsion Laboratory, Pasadena, CA, USA
2 Department of Meteorology, Texas A&M University, College Station, Texas, USA
*Corresponding author address: Dr. Stephen S. Leroy, Jet Propulsion Laboratory, MS 169-237, 4800 Oak Grove Dr., Pasadena, California 91109 USA; E-mail: Stephen.S.Leroy@jpl.nasa.gov


INTRODUCTION
Carbon dioxide concentration in the atmosphere is rising at about 0.4% per year, almost entirely due to emissions from fossil fuel combustion, and it is expected to cause increases in global temperature. All of this is completely addressed in the reports of the Intergovernmental Panel on Climate Change (IPCC) (Houghton et al., 1991; Houghton and Meira Filho, 1996). The overriding focus of the research summarized in the IPCC reports is to understand the effects greenhouse gas forcing has had in the past and the effects it is likely to have in the future. In this paper we do not attempt to outline all of the issues addressed in the IPCC reports but instead outline only climate signal detection and attribution and the statistical testing of climate models. We lay out the mathematics of each, keeping in mind how occultation data obtained by COSMIC will apply. A commonly occurring theme will be the importance of testing and improving climate models.
The issue of detecting changes in the climate and attributing those changes to specific "forcings" is a general one which has most frequently been applied to warming of Earth's surface together with anthropogenic greenhouse gas increases. While the "attribution" problem is highly relevant for public policy purposes, its scientific value is just beginning to be put into context: detection and attribution tells more about the quality of climate models than it does about cause and effect in the climate system. Namely, the effects of any specific climate forcing can only be hypothesized by the use of a climate model (global climate models, GCMs, being the most sophisticated), and these models give highly inconsistent results. In fact, GCMs predict anywhere from a 2 K to a 5 K increase in surface temperature due to a doubling of carbon dioxide concentration. Clearly, the models themselves cannot be satisfactory predictors of climate change yet, so it is data that we must rely upon to update the models. Hence, commonly it is the climate "sensitivity", or the surface temperature increase caused by a doubling of CO2, which is sought out in data. Here we lay out the most current versions of climate signal detection, describe how one would implement occultation data from COSMIC, describe recent results in the literature, and show rigorously how this process is related to the testing and improving of GCMs.
Secondly, the common method of validation of GCMs as laid out in the IPCC reports is generally well-suited to mimicking the current climate state but ill-suited to guaranteeing accurate and precise predictions of future climate states. The IPCC 1995 report opens its discussion of model evaluation by stating that it takes the approach of Oreskes et al. (1994), which endorses the matching of known climate states between model and data. Very often this is accomplished by adjustment of parameterizations of sub-gridscale processes such as cloud microphysics, air-sea interactions, turbulence, etc. When this is done, it is difficult to know whether the change is physically based, especially since the changes are frequently implemented to guarantee overall stability of the model (see the discussion on flux adjustment in the IPCC 1995 report). If the physics of the model is not reasonable, then predictions of future changes in the climate state cannot reasonably be trusted. Hence, in this paper we propose two methods of using COSMIC data to test a climate model according to its physics, which we believe makes it more believable as a predictor of future climates.

CLIMATE SIGNAL DETECTION
To what degree is the warming of the last century caused by increasing concentrations of anthropogenic greenhouse gases? Could the warming possibly be a natural fluctuation of the climate system instead? In order to answer these questions, Barnett and Schlesinger (1987) suggested that the spatial pattern of the warming ought to distinguish it from patterns of natural variability. This technique is known generally as "fingerprinting," a signal having a unique fingerprint which would distinguish it from other suspected signals. At first the fingerprinting technique was applied as a simple correlation between the expected signal and the data to be used (c.f. Santer et al., 1995). Whenever such a correlation is done, it is always expected that even in the absence of a signal-to-be-detected some "noise" leaks through. This noise is the result of the "natural variability" of the climate, and hence research into the problem of detecting a signal came to be answered in terms of signal-to-noise ratios (Hasselmann 1979). Subsequently others used conventional signal processing methods to show that a kind of filter could be devised which would minimize the amount of noise that would "leak" into a detection, a kind of optimization of the correlation technique (Bell 1982; Bell 1986; North et al., 1995). Finally, it was shown that several different signals can be detected simultaneously in an optimal fashion (Hasselmann 1993). Here we will call this optimal multi-pattern regression.
The main premise in climate signal detection studies is that the pattern of the signal as it would appear in the data can be adequately described by a GCM (or simpler model). Although the initial aim was to detect global warming and attribute it to an enhanced greenhouse effect, in reality any "externally" forced signal can be sought in signal detection. For instance, naturally occurring phenomena, such as a change in solar insolation or suspended ash from a volcanic eruption, can be viewed as external forcings. Even phenomena internal to a climate system can be treated as external forcings, such as a sea-surface temperature warming associated with an El Niño.

Correlation Studies
Assume we have a data set d. This data set can be comprised of any atmospheric or oceanic variable of interest or any combination thereof, such as coefficients of a Fourier transform or of a spherical harmonic expansion. In most studies, the data is grouped according to time, d(t). It is important that the data be relative to a mean. We wish to find whether a signal is present in that data.
In all detection studies, we assume that the pattern of the signal s can be modeled by a computer to within a scaling factor a. Such a pattern is computed by subtracting a control run of a GCM with the desired external forcing absent from another run of the same GCM with the external forcing present. The data can then be modeled by

d = a s + n,    (1)

and in the case of a timeseries of data, the data is modeled by

d(t) = a(t) s + n(t).    (2)

TAO, Vol. 11, No. 1, March 2000

For now we will consider that s is a strictly spatial pattern and thus the scaling factor a evolves in time. In both cases n represents the noise of natural climate variability as it would appear in the data. It is assumed that a mean has been subtracted from the data. In applying the correlation, the best estimate of the signal amplitude a is found by taking the inner product of the data with the signal pattern:

â = (s · d) / s²,    (3)

in which s² = s · s. The error associated with this estimate of the signal amplitude is found by computing the root-mean-square difference (â − a), assuming that the climate noise n is Gaussian. The error, σ_a, is given by

σ_a² = ⟨(s · n)²⟩ / (s²)².    (4)

The notation ⟨...⟩ indicates an ensemble average. When dealing with a timeseries of data d(t), the uncertainty is estimated by taking n as the output from the control run without the external forcing. The signal-to-noise ratio is then simply SNR = â/σ_a, and the confidence level of detection is erf(SNR/√2).
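As a concrete illustration (not from the original paper: the pattern, amplitude, and noise level below are synthetic stand-ins), the correlation estimate of equations 3 and 4 and the resulting signal-to-noise ratio can be sketched in a few lines of Python:

```python
import math

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical signal pattern s (a model-derived fingerprint on a 50-point
# grid) and synthetic "observed" data d = a*s + n.
s = np.sin(np.linspace(0.0, np.pi, 50))   # assumed fingerprint shape
a_true = 0.8                              # assumed true amplitude
noise_sd = 0.05

# Control-run ensemble standing in for the GCM run used to characterize n.
n_samples = [rng.normal(0.0, noise_sd, 50) for _ in range(500)]

d = a_true * s + rng.normal(0.0, noise_sd, 50)

s2 = s @ s                  # s^2 = s . s
a_hat = (s @ d) / s2        # best-estimate amplitude, equation 3

# equation 4: sigma_a^2 = <(s.n)^2> / (s^2)^2, ensemble-averaged
sigma_a = math.sqrt(np.mean([(s @ n) ** 2 for n in n_samples])) / s2

snr = a_hat / sigma_a
confidence = math.erf(snr / math.sqrt(2.0))
```

The control-run ensemble plays the role of the unforced GCM run used to estimate ⟨(s · n)²⟩.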
The inner product can be defined in any of a variety of ways. The data can take the form of a vector of similar (or even non-similar) elements. For instance, it can be only temperatures, or a combination of temperatures and humidities. The data can also take the form of a functional wherein the inner product becomes an integral over volume. In any case, a problem arises because in this method the inner product does not necessarily remain invariant under coordinate transformation. This means that different combinations of the same data can yield very different results for the signal-to-noise ratio. In practice, it leaves undetermined how one should weight data in data-sparse regions: how does one assign the appropriate amount of weight to surface temperature measurements in oceanic regions where there is very little data?
While this method can give a fairly robust estimate of the confidence level of detection for a single pronounced signal, it is inappropriate should the signal be weak or other strong signals be present. The former is a concern with some global warming detection work in the literature (Tett et al., 1996). In that work, the authors assembled a signal shape using a linear combination of the effects of enhanced greenhouse warming, cooling by sulfate aerosols, and stratospheric cooling by ozone loss to best match the signal seen in the data. Their result is that they find the signal with a high level of confidence, but in a way this is circular reasoning. A better way to tackle the problem must be with a signal detection method which accounts for multiple signals.

Multi-pattern Regression
We can adapt the correlation method above to handle the problem of multiple signals. The model becomes

d(t) = Σ_{i=1}^{n} a_i(t) s_i + n(t).

It is assumed that there are n signals present, each with its own signal pattern s_i and amplitude a_i which evolves in time. One can arrive at a best estimate of the signal amplitudes as a function of time by a least-χ² method. One defines χ² by

χ²(t) = (d(t) − Σ_i a_i(t) s_i)²,

and minimizing it gives

â = G⁻¹ b,

in which the elements of the matrix G are given by G_ij = s_i · s_j and the elements of the vector b by b_i = s_i · d. The uncertainty is given as an uncertainty covariance matrix A, the elements of which are defined as A_ij = ⟨δa_i δa_j⟩, in which δa_i refers to a departure from the best estimate â_i. The uncertainty covariance matrix is

A = G⁻¹ T G⁻¹,

in which the elements of T are T_ij = ⟨(s_i · n)(s_j · n)⟩.
One can check quickly that the amplitudes of the signals can be determined uniquely: the presence of one signal theoretically cannot be confused with the presence of another. This works because in detecting a particular signal, one is effectively correlating against a part of that signal which is unique to it alone and does not correlate with any of the other signals. In reality, though, it is impossible to formulate the signal patterns s_i perfectly, and, through G⁻¹, the errors in their forms will cause confusion when distinguishing between signals.
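A minimal numerical sketch of the multi-pattern least-squares solution, with two synthetic patterns standing in for, say, greenhouse and volcanic fingerprints (all values below are hypothetical, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two hypothetical signal patterns as the columns of S on a 40-point grid.
x = np.linspace(0.0, 1.0, 40)
S = np.column_stack([x, np.cos(2 * np.pi * x)])   # assumed patterns s_1, s_2

a_true = np.array([1.5, -0.7])                    # assumed true amplitudes
n = rng.normal(0.0, 0.02, 40)                     # climate noise
d = S @ a_true + n

# G_ij = s_i . s_j,  b_i = s_i . d,  a_hat = G^-1 b
G = S.T @ S
b = S.T @ d
a_hat = np.linalg.solve(G, b)

# Uncertainty covariance A = G^-1 T G^-1 with T_ij = <(s_i.n)(s_j.n)>,
# estimated here from an ensemble of control-run noise samples.
noise = rng.normal(0.0, 0.02, size=(1000, 40))
proj = S.T @ noise.T                              # s_i . n for each sample
T = proj @ proj.T / 1000.0
Ginv = np.linalg.inv(G)
A = Ginv @ T @ Ginv
```

With well-separated patterns the two amplitudes are recovered essentially independently, illustrating the uniqueness argument above.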
Even though the difficulty of dealing with multiple signals is resolved by this method, the problem of transformation invariance remains. By taking different combinations of the same data, one can arrive at different conclusions. The only way to avoid this difficulty is to define a metric which makes χ² invariant under transformation. This metric will also optimize the detection, as we will see in the next section. Curiously, this method of multi-pattern regression has not been applied in detecting climate signals. For studies such as those by Santer et al. (1996), it would only have required the computation of the 3-by-3 matrix G. North et al. (1995) derived a set of coefficients to be used in correlation which minimizes the amount of noise that would leak into the detection using the theory of optimal filtering.

Optimization
The model is that of equation 1, and the noise is assumed Gaussian, with e_µ and λ_µ being the eigenvectors/eigenfunctions (empirical orthogonal functions, EOFs) and eigenvalues of the noise covariance matrix ⟨n nᵀ⟩. Their result is

â = [Σ_µ (e_µ · s)(e_µ · d)/λ_µ] / [Σ_µ (e_µ · s)²/λ_µ].

The sum over µ is arbitrary: any subset of the EOFs can be chosen. The error in the estimate of a is just

σ_a² = [Σ_µ (e_µ · s)²/λ_µ]⁻¹.

This technique resolves the problem of transformation invariance: the result depends only on the type of data used and not on a particular linear combination of the data. Also, the data (and signal pattern and EOFs) can be spatial-temporal in nature, functional or vectorial in form. For instance, if surface temperature measurements are used, the result for a would not depend on how oceanic-based stations are weighted in comparison to land-based stations in the composition of the data d.
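The optimally weighted estimate and its error can be sketched as follows; the EOF basis, eigenvalue spectrum, and signal pattern are all synthetic choices for illustration, and the noise is drawn consistently with the assumed covariance:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic noise covariance with a known EOF basis: random orthonormal
# e_mu (rows of e) and a decaying eigenvalue spectrum lambda_mu.
m = 30
Q, _ = np.linalg.qr(rng.normal(size=(m, m)))
e = Q.T                                   # rows are EOFs e_mu
lam = 1.0 / (1.0 + np.arange(m)) ** 2     # eigenvalues lambda_mu

s = rng.normal(size=m)                    # hypothetical signal pattern
a_true = 2.0

# Draw noise consistent with the covariance sum_mu lam_mu e_mu e_mu^T.
n = sum(np.sqrt(l) * rng.normal() * ev for l, ev in zip(lam, e))
d = a_true * s + n

# Optimally weighted amplitude estimate and its error.
num = sum((ev @ s) * (ev @ d) / l for ev, l in zip(e, lam))
den = sum((ev @ s) ** 2 / l for ev, l in zip(e, lam))
a_hat = num / den
sigma_a = den ** -0.5
```

In the noise-free limit the estimator returns the true amplitude exactly, regardless of which subset of EOFs is retained.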
This optimization can be generalized to multi-pattern regression as shown in Hasselmann (1997). In this case, the best estimates for the signal amplitudes a are given by

â = Γ⁻¹ Π,

where the elements of the matrix Γ and the vector Π are

Γ_ij = Σ_µ (e_µ · s_i)(e_µ · s_j)/λ_µ

and

Π_i = Σ_µ (e_µ · s_i)(e_µ · d)/λ_µ.

The uncertainty covariance in the signals' amplitudes is

A = Γ⁻¹.

As in the multi-pattern regression described earlier, this method is also able to perfectly distinguish between signals with different patterns. As before, if the patterns are described imperfectly, this technique will confuse the presence of one signal for another. This last set of equations is state-of-the-art in climate signal detection. It is referred to as optimal multi-pattern regression.
The form presented here can be read in either functional form or vector form, depending on how the inner product is defined. Because ultimately this problem is applied numerically, it is most appropriate to think of the quantities d, s_i, and e_µ as vectors. As such, the noise covariance matrix can be written as

N = Σ_µ λ_µ e_µ e_µᵀ,

and optimal multi-pattern regression can be found by using this noise covariance matrix as a metric in defining a new χ² (Hasselmann 1997; Leroy 1999):

χ² = (d − S a)ᵀ N⁻¹ (d − S a).

In this vector form, the i'th column of the matrix S is the individual pattern s_i. Minimizing χ² by varying a gives

â = (Sᵀ N⁻¹ S)⁻¹ Sᵀ N⁻¹ d,    (19)

with uncertainty covariance A = (Sᵀ N⁻¹ S)⁻¹, which is the same result as the optimal multi-pattern regression given above, now in vector form. In practice, the inverse of the noise covariance matrix is composed of a subset of the EOFs:

N⁻¹ = Σ_µ e_µ e_µᵀ / λ_µ.    (20)

In general, those EOFs with smaller associated variances λ_µ are omitted because the amount of information available to compute the noise covariance matrix N does not justify small-variance modes (c.f. Hegerl et al., 1997). (Notice that N N⁻¹ no longer reduces to an identity matrix but to Σ_µ e_µ e_µᵀ.)
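In vector form, the truncated-EOF estimate of equations 19 and 20 is compact in code. The covariance, patterns, and truncation level below are synthetic choices for illustration; note also that with a truncated inverse, N N⁻¹ is no longer the identity:

```python
import numpy as np

rng = np.random.default_rng(3)

m, k = 40, 15                     # data dimension, retained EOF modes

# Build a synthetic noise covariance N from pseudo control-run samples.
control = rng.normal(size=(500, m)) * np.linspace(1.0, 0.1, m)
N = control.T @ control / 500.0
lam, E = np.linalg.eigh(N)
order = np.argsort(lam)[::-1]     # sort EOFs by decreasing variance
lam, E = lam[order], E[:, order]

# Truncated inverse: N^-1 = sum over the leading k modes of e e^T / lam.
Ninv = (E[:, :k] / lam[:k]) @ E[:, :k].T

# a_hat = (S^T N^-1 S)^-1 S^T N^-1 d, with A = (S^T N^-1 S)^-1.
S = np.column_stack([np.linspace(0, 1, m), np.sin(np.linspace(0, 3, m))])
a_true = np.array([0.5, 1.2])
d = S @ a_true                    # noise-free case: recovery should be exact
Gamma = S.T @ Ninv @ S
a_hat = np.linalg.solve(Gamma, S.T @ Ninv @ d)
A = np.linalg.inv(Gamma)
```

Even though the truncated metric discards most of the modes, noise-free amplitudes are still recovered exactly, because the estimator is unbiased for any positive-definite weighting restricted to the retained subspace.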

Application of COSMIC Data
Most of the global warming detection studies to date have utilized a ~120-year record of temperatures from meteorological stations distributed worldwide or a ~50-year record of radiosonde soundings. Naturally, these in situ records do not give satisfactory coverage over the oceans, so future detection studies using systems which obtain global coverage are desirable. The one space-based system which has been utilized in global warming detection studies is the Microwave Sounding Unit (MSU) on the U.S. series of TIROS Operational Vertical Sounder (TOVS) weather satellites (Spencer and Christy 1992a; Spencer and Christy 1992b). The channel 2 radiance measured by MSU is representative of a bulk temperature of the troposphere; interestingly, the MSU channel 2 data show a much weaker temperature trend than do surface temperatures, a result which is in disagreement with current climate modeling.
COSMIC will be but the first in a series of occultation constellation experiments performed by different organizations around the world. As such, we view COSMIC as the benchmark from which other constellations of GPS receivers can be used to detect global warming.
Data from COSMIC will be well-suited to the task of detecting global warming because it is highly sensitive to tropospheric warming and is subject to only very small systematic errors. Leroy (1997) pointed out that GPS occultation can directly measure the geopotential heights of constant pressure surfaces from space. The measurement is dependent only on the sounding of the atmosphere above the layer of interest: the refractivity, which is directly proportional to density in the atmosphere above the mid-troposphere, is integrated from the top of the atmosphere downward to give pressure as a function of absolute height. By evaluating the geopotential energy as a function of the absolute height (at the horizontal position of the occultation) we deduce geopotential height as a function of pressure above the mid-troposphere. (For details on GPS occultation retrieval, see Hajj et al., 1999; Kursinski et al., 1997.) Thermodynamically, the geopotential height is a natural measure of the average temperature of the atmosphere below it:

h(p) = ∫_p^{p_s} [R T / (µ g₀ p')] dp' + h_s,

where h is geopotential height, h_s is the surface geopotential height, p is pressure, p_s is the surface pressure, R is the ideal gas constant, T is temperature, µ is the mean molecular mass, and g₀ is a standard value of gravitational acceleration. Thus, even though the measurement accuracy is independent of the atmospheric state below the layer of interest, the measured quantity is nonetheless a measure of the average temperature below that layer.
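For an isothermal test atmosphere the integral above has a closed form, h = (R T/(µ g₀)) ln(p_s/p), which makes a convenient numerical check (the constants and profile below are assumed for this sketch, not taken from the paper):

```python
import math

# Assumed constants: ideal gas constant, mean molar mass of dry air, gravity.
R = 8.3145      # J mol^-1 K^-1
M = 0.028964    # kg mol^-1 (plays the role of mu in the text)
g0 = 9.80665    # m s^-2

def geopotential_height(p, p_s, T_of_p, h_s=0.0, steps=10000):
    """Integrate h(p) = int_p^{p_s} R T / (M g0 p') dp' + h_s,
    using the midpoint rule in ln p' for numerical stability."""
    lo, hi = math.log(p), math.log(p_s)
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx          # midpoint in ln p'
        total += R * T_of_p(math.exp(x)) / (M * g0) * dx
    return total + h_s

# Isothermal check against the closed form h = (R T / (M g0)) ln(p_s / p).
T0 = 250.0
h = geopotential_height(50000.0, 101325.0, lambda p: T0)
h_exact = R * T0 / (M * g0) * math.log(101325.0 / 50000.0)
```

A warmer column yields a larger geopotential height, which is exactly why h(p) acts as a bulk thermometer for the atmosphere below the pressure surface.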
GPS occultation of Earth's atmosphere can only directly observe geopotential heights of constant pressure surfaces above the mid-troposphere because water vapor begins to contribute substantially to the refractivity in the lower troposphere, making the interpretation of refractivity in terms of more conventional atmospheric parameters ambiguous. While measuring the geopotential heights of pressure surfaces above the mid-troposphere is probably sufficient for global warming detection, it is valuable for reasons mentioned later to retain information on the lower atmosphere. Thus, we propose that the parameter to be retained for detection work and other work mentioned in following sections should be an integrated refractivity,

Ñ(h, x) = a ∫_h^∞ N(h', x) dh',

in which h is geopotential height, N is refractivity, x is the horizontal coordinate, and the constant a = 0.044024 Pa m⁻¹. Above the mid-troposphere, Ñ is just the atmospheric pressure.
In the lower troposphere, Ñ can be interpreted as the "weight" of refractivity above the geopotential height h, which is similar to the column amount of water vapor above that height.
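The claim that Ñ reduces to pressure above the mid-troposphere can be checked numerically for a dry isothermal atmosphere. The dry refractivity relation N = 77.6 p/T (p in hPa) used below is an assumption of this sketch, not a formula from the text:

```python
import math

# Assumed constants: a converts integrated refractivity to pressure (value
# quoted in the text); kappa is the standard dry refractivity coefficient.
a = 0.044024          # Pa m^-1
kappa = 77.6          # N-units K hPa^-1 (N = kappa * p / T, assumed)
R, M, g0 = 8.3145, 0.028964, 9.80665
T0 = 250.0                            # isothermal test atmosphere
H = R * T0 / (M * g0)                 # pressure scale height, m
p0_hpa = 1000.0                       # assumed surface pressure, hPa

def N_dry(h):
    """Dry refractivity at height h in the isothermal test atmosphere."""
    return kappa * p0_hpa * math.exp(-h / H) / T0

# Ntilde(h) = a * int_h^inf N dh', by trapezoids out to ~200 km.
h_obs = 8000.0
dz = 10.0
integral = sum(0.5 * (N_dry(h_obs + i * dz) + N_dry(h_obs + (i + 1) * dz)) * dz
               for i in range(20000))
Ntilde = a * integral                 # should equal pressure in Pa

p_true = 100.0 * p0_hpa * math.exp(-h_obs / H)   # actual pressure, Pa
```

The agreement reflects that a is, to within rounding, µ g₀/(κ R), so the integrated dry refractivity reproduces the hydrostatic pressure.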
We propose that the units of Ñ be called "equivalent Pascals." We envision that the data to be used in the optimal filter will be the spherical harmonic expansion coefficients of Ñ at a few geopotential heights as a function of time. The coefficients will be denoted as Ñ_lm^(c,s)(h, t), in which l is the spherical harmonic degree, m is the order, c, s refer to a cosine or sine coefficient, h is the geopotential height, and t is the time period. The actual field is represented as

Ñ(φ, θ, h, t) = Σ_{l=0}^{n} Σ_{m=0}^{l} [Ñ_lm^(c)(h, t) cos(mφ) + Ñ_lm^(s)(h, t) sin(mφ)] P_lm(sin θ),

in which φ is longitude, θ is latitude, and P_lm(sin θ) is an associated Legendre polynomial (Abramowitz and Stegun 1972). Notice that Ñ_lm^(s) = 0 when m = 0. For an n-degree expansion, there are (n+1)² coefficients. If we wish to retain information which describes the atmosphere on continental scales, a degree-5 expansion is required, which gives 36 coefficients.

Climate signal detection requires that data be gridded on timescales at least a month in duration. With COSMIC, roughly 110000 occultations will be collected each month, covering the entire globe and all times-of-day (sun-fixed longitudes). Figure 1 shows a simulated distribution of soundings by COSMIC for one day. Call Ñ_i(φ_i, θ_i, t_i, h) the data from sounding i at longitude and latitude (φ_i, θ_i) at time t_i during a month. A set of coefficients B_lm,i^(c,s) is required to convert these data to the set of spherical harmonic coefficients:

Ñ_lm^(c)(h, t) = Σ_i B_lm,i^(c) Ñ_i(h),    (24)

Ñ_lm^(s)(h, t) = Σ_i B_lm,i^(s) Ñ_i(h).    (25)

We use Ñ_i(h) as a shorthand for Ñ_i(φ_i, θ_i, t_i, h). There are a variety of methods to determine the coefficients B_lm,i^(c,s). Since the spherical harmonic coefficients are really intended to be a measurement of the atmospheric mean over a month, we would prefer that the conversion coefficients be chosen so that the differences between the spherical harmonic coefficients determined by equations 24 and 25 and the true coefficients for the month are as small as possible.
The implication is that an optimal method should be used to determine the conversion coefficients B_lm,i^(c,s); however, the extremely large number of data involved makes the determination of the conversion coefficients prohibitive. The section on Bayesian interpolation in Leroy (1997) shows one alternative method of deriving such conversion coefficients.
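The role of the conversion coefficients B can be illustrated with an ordinary least-squares fit, which likewise expresses each spherical harmonic coefficient as a fixed linear combination of the soundings. The degree-1 basis (unnormalized, without Condon-Shortley phase) and the synthetic sounding distribution below are illustrative only:

```python
import math

import numpy as np

rng = np.random.default_rng(4)

# Scattered "soundings": random longitudes phi and latitudes theta,
# uniformly distributed over the sphere (synthetic stand-in for a month
# of occultation locations).
n_obs = 2000
phi = rng.uniform(0.0, 2.0 * math.pi, n_obs)
theta = np.arcsin(rng.uniform(-1.0, 1.0, n_obs))

def basis(phi, theta):
    """Degree-1 real spherical-harmonic basis: P_00 = 1, P_10(sin t) = sin t,
    P_11(sin t) = cos t paired with cos(phi) and sin(phi)."""
    return np.column_stack([
        np.ones_like(phi),            # l=0, m=0
        np.sin(theta),                # l=1, m=0
        np.cos(theta) * np.cos(phi),  # l=1, m=1, cosine
        np.cos(theta) * np.sin(phi),  # l=1, m=1, sine
    ])

coef_true = np.array([3.0, -1.0, 0.5, 0.25])       # hypothetical coefficients
field = basis(phi, theta) @ coef_true + rng.normal(0.0, 0.01, n_obs)

# Least squares plays the role of the conversion coefficients B: each fitted
# coefficient is a fixed linear combination of the individual soundings.
B = basis(phi, theta)
coef_fit, *_ = np.linalg.lstsq(B, field, rcond=None)
```

With ~2000 well-distributed soundings, a low-degree expansion is strongly over-determined, which is why the coefficients are recovered accurately despite measurement noise.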
The difference between the calculated spherical harmonic coefficients and the true spherical harmonic coefficients which describe the mean state of the atmosphere for the month is called "sampling" error. This error arises because the data does not completely sample the atmosphere in time and space. In between the data locations, spatially and temporally, there are maxima and minima in the integrated refractivity field which are missed by the occultation data. Although the variability which contributes to the error has short time and spatial scales, it is associated with synoptic-scale weather systems. Li and North (1998) show how to estimate the sampling error for a conversion filter such as that given by equations 24 and 25. The result is an error covariance matrix in the elements Ñ_lm^(c,s)(h), which should be added to the climate variability covariance matrix N for climate signal detection. This will alter the EOFs, generally increasing the eigenvalues λ_µ. An estimate of the sampling error for data distributed like that expected from COSMIC has not yet been done.

Fig. 1. A simulated distribution of occultations by COSMIC for one day. Taken from Stevens (1998).
In addition to noise due to natural climate variability and sampling error, one must also consider systematic error. This error arises because no observing system measures atmospheric variables perfectly. In this case, a vertical profile of refractivity deduced from an occultation, computed by an Abel transform, does not recreate the actual atmospheric profile precisely, primarily because the horizontal resolution of the occultation, about 200 km, is too large to capture spatial water vapor variability in the lower troposphere. While this particular error will average toward zero with 110000 monthly occultations, there are others which are not expected to average out. These are systematic. The dominant contributor to the systematic error is the ionosphere: solar-cycle variability induces variability in ionospheric activity, which in turn induces error in refractivity retrievals, especially in the stratosphere. This error translates to a 100-m error in geopotential height at 40-km altitude, falling exponentially to a 1-m error in the mid-troposphere. While some of this error should cancel by averaging over a month, some will not. Once this error is propagated into the values of Ñ_lm^(c,s)(h), its error covariance matrix should be added to the climate variability covariance matrix N just as the sampling error is. See Kursinski et al. (1997) for a discussion of errors related to GPS occultation.
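Adding the sampling- and systematic-error covariances to the natural-variability covariance necessarily inflates the detection error, as a small synthetic example shows (the covariances here are randomly generated stand-ins, not real error budgets):

```python
import numpy as np

rng = np.random.default_rng(5)

m = 25
s = rng.normal(size=m)                      # hypothetical signal pattern

def random_cov(scale):
    """Random positive-definite covariance with variances ~ scale^2."""
    X = rng.normal(size=(200, m)) * scale
    return X.T @ X / 200.0

N_climate = random_cov(1.0)                 # natural-variability covariance
N_sampling = random_cov(0.3)                # sampling-error covariance
N_systematic = random_cov(0.2)              # systematic-error covariance

def sigma_a(Ntot):
    """Single-pattern detection error: sigma_a^2 = (s^T N^-1 s)^-1."""
    return (s @ np.linalg.solve(Ntot, s)) ** -0.5

sig_clim = sigma_a(N_climate)
sig_full = sigma_a(N_climate + N_sampling + N_systematic)
```

Because the added matrices are positive semi-definite, sig_full is always at least sig_clim: observational error can only lengthen detection times, never shorten them.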

Work to Date
Using correlation and fingerprinting techniques, recent studies have shown that surface temperature increase over the past century can be attributed to anthropogenic greenhouse gas increases with a confidence level of ~95% (Hegerl et al., 1997, and references therein). This must be interpreted as there being a 5% chance that a natural fluctuation of the climate can fully account for the increase of 0.6 K seen over the past century. None of these studies considered that while a 0.6 K fluctuation of the global average surface temperature is rare, it is even rarer that this increase occur steadily, almost monotonically, over the course of 100 years. Even more recent studies consider the temporal structure of the signal as well, and confidence levels have increased to 99% and higher (Tett et al., 1999; Wigley et al., 1998). In short, no GCM has ever shown a global-scale warming of 0.6 K occurring on a 100-year timescale in all the millennia of control runs, a point originally made by Stouffer et al. (1994).
Even though greenhouse-gas forcing seems to be required to explain the rise in surface temperature, nevertheless there are still unexplained features in the surface temperature record. The warming seems to occur in the time intervals 1910-1940 and 1970-present with little change in between. See Fig. 2, the data of which is described in Jones (1994) and Parker et al. (1995). When all data is included in optimal multi-pattern regression, the minimum χ² far exceeds the number of degrees of freedom, meaning that the fit is unsatisfactory. That is, there are still unexplained features in the temporal pattern of surface temperature over the last century. To this day this pattern of warming intervals and steady intervals remains unexplained, although some have suggested that variations in solar forcing might be responsible.

Leroy (1999) and Stevens (1998) have performed optimal filtering studies which probe how global warming detection will be impacted by COSMIC. Even though no data has been obtained yet, these types of studies can still attempt to determine how warm the climate must become before we can unmistakably attribute that warming to increased concentrations of greenhouse gases. Leroy (1999) estimated that the 1-sigma detection time is about 10 years, a much shorter time than if only surface temperatures are used. This result is most sensitive to the overall pattern of tropospheric warming and stratospheric cooling. Philosophically, though, a cooling stratosphere is not evidence that the troposphere is warming because of increased greenhouse gas concentrations. Stevens (1998) appropriately separated the tropospheric and the stratospheric patterns in determining how much each contributes to global warming detection. He found that using the thickness of the troposphere (geopotential height of the ~100-hPa surface) adds very little information over surface temperature measurements.
This is not surprising, primarily because it is expected that temperature fluctuations are strongly coupled throughout the troposphere.
Both Leroy (1999) and Stevens (1998) have demonstrated a commonly occurring problem in optimal detection of climate signals. It was mentioned previously that any subset of EOFs and their eigenvalues can be used in optimal detection. For example, Leroy (1999) estimated the signal pattern Ñ(h) for global warming in the Indian Ocean region (c.f. Fig. 3). In Fig. 4 he shows the spectrum of eigenvalues λ_µ and the square-projections (e_µ · s)². Notice that while the spectrum of λ_µ cascades strongly, the square-projections do not. By inspection of the expression for the detection error σ_a given in the Optimization section, it seems that a nearly infinite signal-to-noise ratio can be obtained for detection. This should not be true. First, as mentioned previously, there is not enough information in the control run to justify the smallest eigenvalues λ_µ. Second, the higher-order EOFs tend to represent the finer, smaller-scale features of the signal pattern, details in which one cannot have much confidence. Therefore, a subjective truncation procedure must be undertaken wherein some modes are retained and the others eliminated. Different researchers take different approaches to this truncation (c.f. Leroy 1999; Hegerl et al., 1996; Hegerl et al., 1997; North and Stevens 1998; Stevens 1998, etc.). Perhaps the most satisfying method is to use only those modes which pass a test of statistical significance before being used to parameterize the climate natural variability (Tett et al., 1999).
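The truncation problem can be reproduced with a synthetic eigenvalue spectrum that cascades over several orders of magnitude while the square-projections stay flat, mimicking the situation shown in Fig. 4 (the spectrum and signal below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(6)

m = 50
# Eigenvalue spectrum cascading over six orders of magnitude.
lam = 10.0 ** -np.linspace(0.0, 6.0, m)
Q, _ = np.linalg.qr(rng.normal(size=(m, m)))
E = Q                                  # columns are EOFs e_mu

s = rng.normal(size=m)                 # signal with flat square-projections

# SNR^2 after retaining the first k modes: sum over mu<=k of (e.s)^2/lam.
proj2 = (E.T @ s) ** 2
snr2 = np.cumsum(proj2 / lam)

# Retaining ever more (poorly constrained) small-variance modes inflates the
# apparent SNR without bound, which is why a truncation criterion is needed.
```

Each additional mode strictly increases the apparent signal-to-noise ratio, and the poorly sampled tail of the spectrum dominates the total, so the cutoff choice controls the quoted detection confidence.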
Most of the preparatory work required to use COSMIC occultation data for signal detection is in estimating the natural climate variability. Stevens (1998) uses CCM3 to estimate the variability of geopotential heights in the tropics, which is directly related to integrated refractivity above the mid-troposphere. Other models should be used as well, for the sake of comparison. In addition, further information may be obtained from tracking the annual cycle. The EOFs in the annual cycle are "cyclostationary" modes (Kim et al., 1996).
In passing, it is worth noting that the fingerprinting approach to detecting climate change, premised on an approximate linearity of the climate system, might prove outdated should the view of Corti et al. (1999) be affirmed. They surmise that instead of reacting linearly to an external forcing, the chaotic climate system visits its preferred states (attractors) with steadily changing frequency. It is not clear how fingerprinting or regression can be applied should such a mechanism be at work.

DETECTION AS MODEL TESTING
While climate signal detection was originally motivated by the need to detect and attribute climate change to human activity, some scientists are now using this technique to test climate models in several different ways. For example, Forest et al. (1997) show how optimal signal detection can be used to estimate parameters of simplistic climate models. This technique can be especially appropriate for forcings on shorter timescales than global warming, such as local oceanic warmings, volcanic eruptions, etc. Also, when signal detection as presented in the previous section is formulated in Bayesian terms, it can be shown that by itself it tests climate models.

Bayesian Priors and Climate Model Predictions
In statistical language, optimal detection is finding the conditional probability of signal amplitudes given a data set. This is written as P(a|d). It is Gaussian in a with a centroid at â and an uncertainty covariance matrix of A. In statistics this is written as a ~ N(â, A), which means

P(a|d) = (2π)^{−n_s/2} |A|^{−1/2} exp[−(1/2)(a − â)ᵀ A⁻¹ (a − â)],    (26)

where n_s signals are being detected (n_s being the dimension of a) and |···| is a determinant.
Recall that â is computed using the data d (c.f. equation 19). Implicit in finding this probability distribution for a is that the shapes of the signals, as prescribed by S (or s_i), are known better a priori than the overall amplitudes of the signals a, since the signal amplitudes are treated as unknowns. In fact, subjectively we probably have a better idea of the overall amplitudes of signals than we do of their spatial scales at high resolution. Therefore, it is only appropriate that we account for our a priori knowledge of signal amplitudes.
In Bayesian statistics, equation 26 is only half of the problem of inference; the other half is the "prior" in the signal amplitudes, since the posterior is a probability density in the signal amplitudes. The prior is a probability distribution in signal amplitudes which depends on the theory, or model, that is used to establish it. It is written as p(a|M), in which M denotes that model M is used to determine the prior probability distribution. Bayes's theorem states that

P(a|d, M) ∝ P(d|a) p(a|M),  (27)

wherein P(a|d, M) is the "posterior" (posterior probability distribution of a), P(d|a) is the "likelihood" (likelihood of the data), and p(a|M) is the "prior" (a priori knowledge of a given model M). The normalization constant for the posterior, P(d|M), is given by integrating over a:

P(d|M) = ∫ P(d|a) p(a|M) da.  (28)

The Bayesian likelihood function P(d|a) is directly proportional to the posterior probability P(a|d) given by more traditional statistics in equation 26.
It is the prior p(a|M) which is uniquely Bayesian. It must be interpreted as a prior expectation of what the signal amplitudes should be before any data are considered, effectively making it subjective. Those expectations derive from the best theories of the climate system available: GCMs.
At this point it is worth noting how global warming predictions are currently done. Usually one works with a single model M. First, a long control run is done with that model to establish how the climate system would behave without any external forcing, and a climatic mean state is determined. Then the model is run with a particular external forcing imposed. The difference between the two runs is the signal. Usually only the overall spatio-temporal form of the signal is retained, allowing a rescaling of the overall amplitude (as was done in the first part of this paper). In reality, though, we also have a prediction for the signal amplitude.
In order to establish a proper prior, the model also needs to assign uncertainties to its predicted a. Most work assumes that the uncertainty derives from the climate variability as generated by the model; however, the true uncertainty derives from the uncertainties of the model itself. For instance, whereas the equations of motion incorporated into the atmospheric and oceanic models are precise, some error is incurred by implementing them in finite-difference form. Most important, though, are the uncertainties associated with the parameters of the model, those parameters being necessary to describe processes in the ocean-atmosphere-biosphere-cryosphere system which occur on sub-gridscales. Examples of such parameters would be associated with schemes of moist convection, cloud formation/properties, surface run-off, etc.
Stephen S. Leroy & Gerald R. North

The uncertainties in these parameters are subjective, but they can be estimated reasonably.
Preferably, then, the uncertainty in the climate signal prediction should be estimated by running the model, with the external forcing included, several times, each time with different but reasonable values of the internal parameters. This is computationally extremely expensive, but alternative approaches to such a calculation have been proposed (Webster and Sokolov 1997).
Because the likelihood function is generally much more sharply peaked than the prior (otherwise there would have been no reason to obtain the data to begin with), the prior will have little effect on the posterior probability distribution. If more than one model is used to generate a prior, however, the prior becomes very useful in a second level of inference. If the relative prior probability for model j is p(M_j), then the posterior relative probability of that model in light of the data d is P(M_j|d):

P(M_j|d) ∝ P(d|M_j) p(M_j),  (29)

wherein the function P(d|M_j) is evaluated using equation 28. If we assume that the prior distribution in a (a model's prediction) is a ~ N(a_p, A_p), then the conditional probability of the data on the model is

P(d|M) ∝ (|A_m| / |A_p|)^{1/2} exp(-A/2),  (30)

where the "accuracy" A, the posterior most probable signal amplitudes a_m, and the posterior amplitude uncertainty covariance matrix A_m are given by

A = (d - S a_m)^T N^{-1} (d - S a_m) + (a_m - a_p)^T A_p^{-1} (a_m - a_p),  (31)

A_m = (Ã^{-1} + A_p^{-1})^{-1},  (32)

a_m = A_m (Ã^{-1} â + A_p^{-1} a_p),  (33)

in which â and its covariance Ã are taken from equation 19.
Throughout these equations the subscript p refers to the prior probability distribution, the subscript m refers to the posterior probability distribution, and the tilde refers to the data likelihood function. The accuracy A given by equation 31 can be thought of as the accuracy of the prediction: the first term on the right is the disagreement between the data and the most likely fit described by a_m, and the second term on the right is the disagreement between the outcome and the prior prediction. The larger A is, the less accurate the result.
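The Bayesian update described above can be sketched numerically. The example below works in amplitude space rather than data space, i.e., it treats the data likelihood as a Gaussian a ~ N(â, Ã) already derived from the detection step; all numbers are hypothetical:

```python
import numpy as np

# Data-likelihood summary (from the detection step) and model prior;
# one signal, so all matrices are 1x1.  Numbers are hypothetical.
a_t = np.array([1.1])        # amplitude estimated from the data
A_t = np.array([[0.04]])     # its covariance (sharply peaked likelihood)
a_p = np.array([0.8])        # model-predicted amplitude (the prior mean)
A_p = np.array([[0.25]])     # prior uncertainty (model uncertainty)

A_t_inv = np.linalg.inv(A_t)
A_p_inv = np.linalg.inv(A_p)

# Posterior covariance and most probable amplitudes.
A_m = np.linalg.inv(A_t_inv + A_p_inv)
a_m = A_m @ (A_t_inv @ a_t + A_p_inv @ a_p)

# Amplitude-space version of the "accuracy": misfit to the data estimate
# plus misfit to the prior prediction; larger theta means less accurate.
theta = float((a_m - a_t) @ A_t_inv @ (a_m - a_t)
              + (a_m - a_p) @ A_p_inv @ (a_m - a_p))
```

Because the likelihood here is much sharper than the prior, the posterior mean lands close to the data estimate, while the posterior covariance is smaller than either input covariance.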
The conditional probability of the data on the model gives the relative probability of one model against another. Equation 30 indicates that the less accurate the result, the less relatively probable the model is in light of the data. This is obvious from experience. What is new with a Bayesian formulation is that the more precise the prediction (i.e., the smaller the prior uncertainty covariance matrix |A_p|), the more relatively probable the model is in light of the data. A model is more useful scientifically if it can give more precise predictions than another model while maintaining good accuracy. What is understood intuitively about precision is made quantitative with Bayesian inference. In this way signal detection can be shown to be a special case of model testing (Leroy 1998).

While detecting warming associated with increased greenhouse gases has been discussed at length, the duration of the resulting climate change makes it unsuitable for use in testing and improving GCMs. There are other external forcings which are better suited, e.g., a sea-surface temperature warming associated with an El Niño event or a volcanic eruption. For an El Niño forcing, we expect the dominant model response to be a change in the global pattern of precipitation and in patterns of tropical convection. The relevant model parameterizations would be those associated with surface-atmosphere interaction, with moist convection, and with cloud formation and evolution. For volcanic forcing, we expect the dominant model response to be regional cooling of the troposphere. The pattern of such a cooling bears upon radiative-transfer parameterizations and circulation/transport in the stratosphere. Differences between modeled and observed responses to external forcings will lie not only in the amplitude but also in the overall pattern. Since the pattern is strictly defined in fingerprinting, it seems that the method given above may not be well suited to improving climate models.
The only improvement to a model will be an adjustment of its parameters, something that is best done with a climate adjoint. Such an adjoint would be similar to those used in weather forecasting (cf. Hall and Cacuci 1982 and Errico 1997). We expect the prime difference between a climate adjoint and a weather adjoint to be that the parameters of the model are adjusted instead of the initial atmospheric state. The development of climate adjoints is a very new field and everything remains to be done.

TESTING MODELS USING SECOND-MOMENT STATISTICS
Climate signal detection clearly relies heavily on climate models: first (and foremost) to prescribe the natural variability of the climate in the variables of interest, and also to specify the patterns of the forced signals for which we are searching. To get a sense of how reliably a climate model can describe the climate variability, it is necessary to test the climate model by another method. Leith (1975) proposed a method which is ideally suited to this task. His idea is based on the application of the fluctuation-dissipation theorem of thermodynamics to the ocean-atmosphere system.
The outcome of his note is that the quality of a model is best tested by comparing second-moment statistics of its variables. Most climate-model validation efforts involve a comparison of model averages and observed means (Houghton and Meira Filho 1996). In the event of a mismatch it is customary to tune the model so that model output matches observations, regardless of the physical reality of the climate system. By paying attention to second-moment statistics of climate variables, though, one is more likely to simulate the actual physics of the climate system realistically, even if long-term averages do not necessarily match reality. Any mismatch of averages then represents the absence of a physical process in the model rather than an implementation of incorrect physics. Hence, we regard a model which is validated by a comparison of second-moment statistics as a more useful tool for prediction than one validated by comparison of first-order moments.
We first describe Leith's note qualitatively. Then we describe the work of some authors who have used this idea to test climate models, including some work which uses second-moment statistics of observed fields to estimate internal feedbacks in the climate system. Finally, we describe how COSMIC data may be useful for second-moment climate model testing.
Leith's Note

Leith (1975) sought to show that there is a link between the second-order moments of a climate system and its sensitivity to external forcings. He applied the fluctuation-dissipation theorem, even though the theorem holds only for Liouville systems (Kraichnan 1959). The climate system is not Liouville, but subsequent work demonstrated that the theorem remains applicable to the climate system to a large degree (North, Bell, and Hardin 1993). A heuristic explanation of Leith's work is presented here. Consider the climate system as characterized by a large set of variables u_α. We wish to find how those variables respond to an external forcing of a subset of those variables. The response as a function of time can be written in terms of a Green's function:

δu_α(t) = Σ_β ∫_0^∞ g_αβ(τ) δf_β(t - τ) dτ,  (34)

in which δu_α is the response of variable u_α to an external forcing δf_β in variable u_β and g_αβ(τ) is the response function.
Again, assume the system is Liouville (energy-conserving) with its variables fluctuating randomly, each fluctuation of a mode leading to responses in all the other modes through weak interactions. All of the modes are in statistical equilibrium, like an equipartition of energy in thermodynamics. One can then think of a fluctuation δu′_β leading to a fluctuation δu′_α through a weak coupling, and one finds that

g′_αβ(τ) = ⟨δu′_α(t) δu′_β(t - τ)⟩ / ⟨δu′_β δu′_β⟩,  (35)

in which δu′_γ is the amplitude of the fluctuation of a mode γ, each mode being a linear combination of the variables u_α and orthogonal to every other mode. The denominator of equation 35 can be thought of as a forcing by the random fluctuations of mode β and the numerator as the response to that forcing. When rewritten in terms of the variables of the system, equation 35 becomes

G(τ) = U(τ) U(0)^{-1},  (36)

in which U(τ) is the time-lagged covariance matrix of the atmospheric variables. Because the time-lagged covariances can be computed from observations, it seems that the climate sensitivity can be derived from observations alone. Nevertheless, it is impractical to compute the climate sensitivity this way because of the number of variables involved. On the other hand, it is clear that it is the second moments of the climate system which determine its sensitivity, and thus it seems important that the second-order moments of a predictive model agree well with reality.
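The fluctuation-dissipation estimate G(τ) = U(τ) U(0)⁻¹ can be illustrated on a toy linear system whose true response operator is known; the two-variable system and all parameter values below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy damped linear system du/dt = B u + white noise, discretized with
# step dt, so the true response operator over a lag is known.
B = np.array([[-0.5, 0.1],
              [0.0, -0.3]])
dt, nsteps = 0.1, 100_000
u = np.zeros(2)
record = np.empty((nsteps, 2))
for k in range(nsteps):
    u = u + dt * (B @ u) + np.sqrt(dt) * rng.standard_normal(2)
    record[k] = u

record -= record.mean(axis=0)              # fluctuations about the mean

# Lagged covariances and the fluctuation-dissipation estimate.
lag = 10                                   # 10 steps, i.e. tau = 1.0
x0, xl = record[:-lag], record[lag:]
U0 = (x0.T @ x0) / len(x0)                 # U(0)
Ut = (xl.T @ x0) / len(x0)                 # U(tau)
G = Ut @ np.linalg.inv(U0)                 # estimated response to forcing

# For this linear system G should approximate the discrete propagator
# (I + dt B)^lag, illustrating that lagged second moments encode the
# system's sensitivity.
```

For a genuinely high-dimensional system the same formula applies, but, as the text notes, the matrices become too large for this to be a practical route to climate sensitivity.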

Second-moment Testing Using COSMIC
As mentioned earlier, testing climate models according to second-moment statistics is still a nascent activity, and consequently it is not yet clear where it will lead. We expect that this work can be divided according to the timescales of the statistics of interest. Statistics of individual soundings over the course of a month will be relevant to atmospheric processes, because timescales internal to the atmosphere are less than a month. Statistics of monthly averages over the course of several years will be relevant to the entire climate system, including atmosphere, ocean, biosphere, and cryosphere. Here we describe what we might find, followed by its implications.

Short Timescale Statistics
When examining the second moments on short timescales, one is sensitive to processes which occur on shorter timescales. If the timescale is on the order of weeks, then the processes being tested are primarily atmospheric as opposed to oceanic and pertain to weather phenomena. We see Haskins et al. (1999) (HGC) as a prototype for this work. HGC calculated statistics of spectral radiances obtained by the IRIS instrument in 1970/1971. They calculated their statistics on a variety of sub-monthly timescales and compared them to the statistics of radiances produced by an atmospheric general circulation model (AGCM). They found that radiance statistics are dominated by cloud conditions; consequently, their comparison revealed something concerning the quality of parameterizations related to cloudiness in the AGCM. Somewhat less significantly, the statistics were also relevant to surface temperature processes and water vapor processes.
We anticipate that occultations using COSMIC will reveal a great deal about processes related to the water vapor cycle in the lower troposphere. Recall that the microwave refractivity of the neutral atmosphere is given approximately by

N = a p/T + b p_w/T²,  (37)

where p is the atmospheric pressure, p_w is the partial pressure of water vapor, T is the temperature, a = 77.6 K mbar⁻¹, and b = 3.73 × 10⁵ K² mbar⁻¹. Whereas refractivity is still dominated by the "density" term (the first term on the right) even in the lower tropical troposphere, everywhere in the lower troposphere the variability of the refractivity is dominated by the "water vapor" term (the second term on the right). In fact, the variance of the "water vapor" term is about two orders of magnitude greater than the variance of the "density" term near the surface. The water vapor term falls off with increasing height much more strongly than the density term. At what point the variance of the density term begins to dominate can be estimated by an examination of the statistics of radiosonde data or weather analyses.
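A minimal sketch of equation 37, using the constants quoted above; the near-surface tropical values of p, p_w, and T are illustrative, not taken from any data set:

```python
# Microwave refractivity constants, as quoted in the text.
a = 77.6    # K mbar^-1
b = 3.73e5  # K^2 mbar^-1

def refractivity(p, p_w, T):
    """Refractivity (N-units) from pressure p (mbar), water-vapor partial
    pressure p_w (mbar), and temperature T (K): N = a*p/T + b*p_w/T**2."""
    return a * p / T + b * p_w / T ** 2

# Illustrative near-surface tropical values (hypothetical).
p, p_w, T = 1013.0, 30.0, 300.0
dry = a * p / T           # "density" term: dominates the mean refractivity
wet = b * p_w / T ** 2    # "water vapor" term: dominates the variability
total = refractivity(p, p_w, T)
```

Even with a large water-vapor burden, the density term is the larger contributor to the mean; the point of the paragraph above is that the water-vapor term nonetheless dominates the *variance* near the surface.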
It is simple enough to calculate the covariance of integrated refractivity profiles N(h, x) in regional bins: define the set of profiles N_i(h, x_i) for soundings with x_i ∈ R, where R defines a region such as the central Pacific Ocean, the Indian Ocean, or the Sahara Desert, and compute the covariance over that set. The application of the fluctuation-dissipation theorem comes in comparing the EOFs of the data to those of the output of an AGCM. AGCMs all have parameterizations which are tuned so that their mean states match climatological mean states. The fluctuation-dissipation theorem implies, though, that the statistics are more indicative of the predictive capabilities of a model than its mean state is; so, a model whose EOFs agree well with those in the data can be relied upon to give better predictions of future climates than a model whose mean state agrees well with the data.
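The regional binning and EOF computation can be sketched as follows; the synthetic profiles, vertical grid, and the single boundary-layer-weighted mode of variability are all hypothetical, built in so that the leading EOF is known in advance:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic regional ensemble of refractivity profiles: a mean profile
# plus one low-level-weighted mode of variability plus noise.
n_levels, n_soundings = 20, 500
heights = np.linspace(0.0, 19.0, n_levels)       # km, hypothetical grid
mean_profile = 380.0 * np.exp(-heights / 7.0)
mode = np.exp(-heights / 2.0)                    # boundary-layer-weighted
amps = rng.standard_normal(n_soundings)
profiles = (mean_profile
            + np.outer(amps, 10.0 * mode)
            + rng.standard_normal((n_soundings, n_levels)))

# Covariance of anomalies about the regional mean, and its EOFs.
anom = profiles - profiles.mean(axis=0)
cov = (anom.T @ anom) / (n_soundings - 1)
evals, evecs = np.linalg.eigh(cov)               # ascending eigenvalues
order = np.argsort(evals)[::-1]
evals, eofs = evals[order], evecs[:, order]      # descending variance
```

The same covariance and EOF computation would be applied to model output binned identically, and the comparison of the two sets of EOFs is the test the paragraph describes.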
It is probably appropriate to choose regions which are thought to have a distinctive atmospheric process at work. For example, any second moments calculated in the tropics will be directly relevant to moist convection and indirectly relevant to all those processes related to moist convection, including ocean-atmosphere interaction, cloud formation, and radiative processes. Second-moment statistics in the midlatitude storm bands will be directly relevant to horizontal mixing processes and stratosphere-troposphere exchange.
As an example, a second moment of interest would be the correlation between refractivity at the surface and refractivity at altitude. It is hypothesized by Lindzen (1995) (and references therein) that humidity in the upper troposphere is anticorrelated with humidity in the lower troposphere, making for a negative feedback on increased radiative forcing at the surface. A second-moment statistic related to such a process would be the correlation between surface specific humidity and upper-tropospheric humidity. In Fig. 5 we show such a correlation for two regions as it exists in the 300-year control run of CCM3 (Boville and Gent 1998). While specific humidity is not directly retrievable from atmospheric refractivity, its contribution to refractivity variance in the lower troposphere is about 100 times larger than the contribution of the dry air. Thus, vertical correlations of refractivity from occultation data should inform us mostly about water vapor physics and transport in GCMs.
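A sketch of such a vertical correlation statistic, computed on synthetic anomalies with an anticorrelation built in for illustration (the coefficients are hypothetical, not taken from CCM3 or from Lindzen):

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic lower- and upper-tropospheric refractivity anomalies with an
# anticorrelation built in (coefficients are hypothetical).
n = 1000
low = rng.standard_normal(n)                       # near-surface anomaly
high = -0.6 * low + 0.8 * rng.standard_normal(n)   # upper-level anomaly

def corr(x, y):
    """Pearson correlation coefficient of two series."""
    x = x - x.mean()
    y = y - y.mean()
    return float((x @ y) / np.sqrt((x @ x) * (y @ y)))

r = corr(low, high)   # a negative r is the signature of the hypothesis
```

Applied to binned occultation soundings, a significantly negative r between surface and upper-level refractivity anomalies would be the kind of second-moment evidence bearing on the hypothesized feedback.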

Longer Term Statistics
When examining second moments on timescales longer than several weeks, the physical processes relevant to the entire climate system (atmosphere, ocean, cryosphere, biosphere) become relevant. Such second moments we equate with the natural variability of the climate. We see the work of Polyak, North, and colleagues as a good example of model testing by statistical analysis on long timescales. Their work is described below.
Recently, Polyak et al. (Polyak 1996; Polyak and North 1997a; Polyak and North 1997b) have described a method which is a linearized version of the Leith problem. They write a linear model of the natural variability of the zonally averaged surface temperature field T_i for different latitude belts i:

dT_i/dt + Σ_j A_ij T_j = e_i(t).  (38)

One adapts the model to data by finding the linear coefficients A_ij using optimal statistical methods for both data and control runs of models, and comparing. Since the records are short, one must assess the statistical significance of the differences. The values of the coefficients are very important since in principle the eigenvalues of A approximate the feedbacks in the system if it is perturbed by time-independent (static) external influences. Such a static perturbation (e.g., doubling CO₂) is a standard measure of the sensitivity of a model. For example, for an external perturbation ΔQ_j,

ΔT_i ∝ Σ_j (A⁻¹)_ij ΔQ_j.  (39)

The coefficients are estimated from data streams and from data generated by a GCM. They depend sensitively upon the lagged space-time covariances in these two streams. If the model does not get these coefficients right, it is not likely to get the sensitivity right. Hence, we feel that getting these coefficients right is a necessary condition for model success in simulating climate change.
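Estimating the coefficients A_ij of equation 38 can be sketched by regressing the observed tendency on the state; the two-belt system and its feedback matrix below are hypothetical, chosen so the estimate can be checked against the truth:

```python
import numpy as np

rng = np.random.default_rng(4)

# Two "latitude belts" with a known feedback matrix: dT/dt = -A T + e(t).
A_true = np.array([[0.4, -0.1],
                   [-0.05, 0.6]])
dt, nsteps = 0.1, 100_000
T = np.zeros(2)
states = np.empty((nsteps, 2))
tends = np.empty((nsteps, 2))
for k in range(nsteps):
    e = rng.standard_normal(2) / np.sqrt(dt)   # white-noise forcing
    dTdt = -A_true @ T + e
    states[k], tends[k] = T, dTdt
    T = T + dt * dTdt

# Least-squares fit of dT/dt = -A T to the record.  With Y ~ X @ (-A^T),
# the fitted coefficient matrix is -(A^T), so transpose and negate.
coef = np.linalg.solve(states.T @ states, states.T @ tends)
A_est = -coef.T
```

Once A is estimated this way from both a data stream and a control run, the static sensitivity of equation 39 follows from A⁻¹ applied to the perturbation, and the comparison of the two estimates is the model test.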
While the cited papers open the door to studying this problem, much has to be done to improve our understanding of the process. First, the studies mentioned do not take into account the fact that signals are contained in the data stream. These induce correlation structure that is not in the unforced control runs. One must either add the signals to the model runs or remove some approximation of them from the data stream. Second, it is not clear how many variables are required to describe the internal feedbacks of the system adequately. For instance, one is clearly interested in sea-surface temperatures and zonal winds in the tropical Pacific Ocean if any feedback related to El Niño is to be understood. Dealing only with surface air temperatures may not suffice. Such work is yet to be done.
We expect that COSMIC will significantly augment our knowledge of the variability of the climate system by providing data on upper-air geopotential heights (Leroy 1997). For example, a quantity of particular interest obtainable with COSMIC is the spatial correlation of upper-tropospheric and stratospheric integrated refractivities in polar latitudes. The Arctic Oscillation (Thompson and Wallace 1998; Shindell et al., 1999) ought to be easily detected in such hemisphere-wide correlations. Furthermore, COSMIC offers the possibility of finding a similar oscillation in the southern hemisphere, where upper-air sounding is much less frequent than in the northern hemisphere.

A Climate Adjoint?
Inevitably, the statistics of the data and of the model will disagree at some level, at which point one would like to know how to correct a model parameterization so that model and data statistics agree better. It is not yet clear how to do this. It is thought (Goody et al., 1998) that a climate-model adjoint is needed, similar to those used in data assimilation in numerical weather forecasting but different in that it is the model parameters which require adjusting. Such a climate adjoint has not yet been developed for any AGCM. Until one is developed, different parameterizations can be rated according to how the second moments of their implementation compare to the second moments of the data.