Assessment of the cloud liquid water from climate models and reanalysis using satellite observations

We perform a model-observation comparison and report on the state-of-the-art cloud liquid water content (CLWC) and path (CLWP) outputs from present-day global climate model (GCM) simulations in CMIP3/CMIP5, two other GCMs (UCLA and GEOS5), and two reanalyses (ECMWF Interim and MERRA), in comparison with two satellite observational datasets (CloudSat and MODIS). We use two different liquid water observation products, from CloudSat and MODIS, for CLWP and their combined product for LWC, with a method to remove the contribution from precipitating and convective core hydrometeors so that more meaningful model-observation comparisons can be made. Considering CloudSat's limitations in CLWC retrievals due to contamination from precipitation and from radar clutter near the surface, an estimate of CLWC is synergistically constructed using MODIS CLWP and CloudSat CLWC. The model-observation comparison shows that most of the CMIP3/CMIP5 annual mean CLWP values are overestimated by factors of 2-10 compared to observations globally. A number of models, including the CMIP5 models CSIRO and MPI and the UCLA GCM, perform well compared to the other models. For the vertical structure of CLWC, significant systematic biases are found, with many models biased significantly high above the mid-troposphere. In the tropics, systematic high biases occur at all levels above 700 hPa. Based on the Taylor diagram, the ensemble performance of the CMIP5 CLWP simulations shows little or no improvement relative to CMIP3.

Article history: Received 25 January 2018; Revised 25 June 2018; Accepted 4 July 2018


Introduction
Representing clouds and cloud-climate feedbacks in global climate models (GCMs) remains a pressing challenge for reducing and quantifying the uncertainties associated with climate change projections (IPCC 2007, 2013). The vertical structures of clouds simulated by present-day models have not been extensively examined using vertically resolved cloud hydrometeor quantities such as ice water content (IWC) and liquid water content (LWC). The rudimentary work done thus far suggests that the Intergovernmental Panel on Climate Change (IPCC) models have significant biases relative to observational products in ice water path (IWP) (Waliser et al. 2009) and LWP values (Li et al. 2008). There is a wide disparity in the cloud ice water path (CIWP) (Waliser et al. 2009; Li et al. 2012) and cloud liquid water path (CLWP) (Li et al. 2008) among the Coupled Model Intercomparison Project phase 3 (CMIP3) models and the Coupled Model Intercomparison Project phase 5 (CMIP5) models.
Clouds strongly influence global climate through their effects on the Earth's radiation budget (e.g., Randall and Tjemkes 1991). The importance of low clouds cannot be overstated, as "cloud feedbacks remain the largest source of uncertainty" in determining Earth's equilibrium climate sensitivity, specifically under a doubling of carbon dioxide scenario (IPCC 2007, 2013). Some evidence for this uncertainty is given in Fig. A1, which illustrates considerable model-to-model disagreement in the CMIP3 liquid water path (LWP; g m⁻²) in the GCM simulations contributed to the 4th Intergovernmental Panel on Climate Change (IPCC) Assessment Report (20c3m scenario) and the 5th Assessment Report (AR5) (Boucher et al. 2013). In addition, LWP/LWC products are derived from observing systems that possess very different characteristics, such as different sensitivities to cloud and precipitation and different physical assumptions employed in the retrieval process (e.g., Stephens and Kummerow 2007). Despite significant efforts to derive LWP measurements from passive and nadir-viewing techniques, the large optical thicknesses, multi-layer structure, and mixed-phase nature of many clouds, including the presence of precipitating hydrometeors (e.g., drizzle), make the estimates from these techniques very uncertain (e.g., Stephens et al. 2008). Such estimates therefore cannot be trusted under precipitating conditions and are removed from this study (see section 2). The ramifications of the poor constraints on cloud water mass, even in terms of total water path, are evident in the model-to-model disagreement for globally averaged cloud LWPs shown in Fig. A1a. As expected, these differences are exacerbated when considering the spatial patterns of the time-mean values shown in Fig. A1b.

Terr. Atmos. Ocean. Sci., Vol. 29, No. 6, 653-678, December 2018
The significant disagreement between models for such a fundamental quantity, that has important effects in the context of climate change, must be reduced to improve future model projections of climate.
Before CloudSat was launched in 2006, global observations of cloud water, particularly the vertically resolved cloud liquid water content (CLWC; mg m⁻³), were not readily available for model development and validation. The CloudSat mission now provides a considerable leap forward in the information gathered regarding tropospheric cloud mass as well as other macrophysical and microphysical properties (e.g., Stephens et al. 2008). CloudSat's cloud profiling radar provides a new view of the global and vertical structure of clouds, in particular the vertical structure of cloud condensate. It is worth noting that, for both the passive and active satellite retrievals and for some models, "liquid water content" (LWC) is understood to represent all liquid hydrometeors, including both suspended cloud liquid and liquid mass in precipitating forms such as rain or drizzle. These observations are an altogether new resource, albeit one with uncertainties and limitations (e.g., Li et al. 2008). Li et al. (2012) pointed out that, to improve the GCM representation of clouds and cloud-climate feedback, considerable care and caution have to be taken to make judicious comparisons between the GCM representations of (typically only) the clouds and the satellite observations, which are an inherent combination of the clouds and falling hydrometeors (e.g., rain, drizzle, cloud ice, or falling snow). Such considerations include taking steps to make a sensible comparison for CLWC or CLWP and cloud ice water content (CIWC) or path (CIWP) (e.g., Li et al. 2012), or taking the steps needed for a viable comparison in terms of reflectivity or radiance, which typically means using satellite "simulators" (e.g., Klein and Jakob 1999; Webb et al. 2001; Delanoë and Hogan 2010; Bodas-Salcedo et al. 2011; Delanoë et al. 2011).
However, such distinctions are often not clearly made, and certainly not always made consistently, among satellite retrievals, model parameterizations, and/or output from models. Because no precipitating liquid particle is included as a prognostic parameter in most GCMs, a direct comparison with observations may not be appropriate.
In this study, we take the approach of Li et al. (2012) and perform the evaluation in terms of the model representations of CLWC/CLWP, utilizing the experience we have gained from cloud ice and liquid (e.g., Li et al. 2005, 2007, 2008, 2011, 2012; Waliser et al. 2009). This includes developing a measure of observational uncertainty (discussed in section 2) and applying an illustrative and quantitative set of evaluation diagnostics. In particular, a new CLWC data set is obtained by constructing vertical profiles of LWC from MODIS LWP combined with CloudSat LWC. The augmentation of CloudSat LWC with MODIS-based LWC is intended to overcome the limitations of CloudSat-derived LWC/IWC retrievals caused by radar clutter near the surface (the lowest two to four radar bins) and/or by the presence of precipitation. A prominent goal of the study is to examine how the fidelity of the models may have changed from CMIP3 to CMIP5. Moreover, we attempt to identify the CMIP5 models that achieve a threshold level of model fidelity by using the Taylor diagram (Taylor 2001) and by taking into account the observational uncertainties (see section 2). In addition, as reanalysis products have become nearly synonymous, in some contexts, with "observations", we also incorporate two recent reanalysis products in our study to provide some assessment of this tenuous perception, particularly for quantities such as CLWP and CLWC that are not strongly constrained by observations.
In sections 2 and 3, we describe the observational resources used in this study, including their retrievals and the methodologies to obtain an observational estimate with some quantitative assessment on uncertainty. In section 4, we briefly describe the models and reanalysis data sets utilized in this evaluation study. In section 5, we illustrate and discuss the results of our model evaluation. Section 6 summarizes the results and draws conclusions.

Observed Estimates of LWC and LWP
The cloud liquid water observations employed in this study are derived from the visible and near-infrared passive observations of the A-Train MODIS and the active spaceborne radar observations of the A-Train CloudSat (Stephens et al. 2002). Each LWP product considered is derived from an observing system with very different characteristics, such as different sensitivities to cloud and precipitation, dynamic range, and representativeness, and with different physical assumptions and other model data employed in the retrieval process (e.g., Stephens and Kummerow 2007). As a consequence, the liquid water content information inferred from each system is distorted by the retrieval process itself, and the interpretation of these products and their comparison warrants a note of caution. Despite the differences, under idealized conditions (e.g., warm, homogeneous, single-layered, fully overcast clouds with no precipitation) there is a high degree of agreement among the products used, at least over the range of liquid water path between 20 g m⁻² and about 200 g m⁻² (e.g., Rossow 1994, 1997; Borg and Bennartz 2007).

MODIS Cloud LWP
The LWP data used are from the MYD06_L2 product. The MODIS LWP is estimated from cloud optical thickness and droplet effective radius, which are inferred from solar reflectance in the 0.64 μm visible band and in one of the water-absorbing near-infrared bands located at 1.6, 2.2, and 3.7 μm (Platnick et al. 2003). The MODIS measurement is based on reflected sunlight over both land and ocean and is available only for daytime. Details of the uncertainties and limitations of the MODIS LWP retrieval are described in Appendix A section 1.
The MODIS LWP retrievals are collocated to the CloudSat profiles by finding the nearest-neighbor MODIS measurement for each CloudSat footprint location. See Appendix A section 2 for details of the collocated data generation process. The collocation was performed in order to apply the CloudSat-based precipitation and cloud condition flags to the MODIS LWP data and thereby partition these data into suspended (cloud-only), convective, and precipitating portions. The time period of these data used in this study is from May 2008 to April 2010.
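The nearest-neighbor matching described above can be sketched as follows. This is only a minimal illustration, not the procedure of Appendix A: the function names, the brute-force search, and the optional `max_km` distance cutoff are assumptions introduced here, and an operational implementation would likely use a spatial index rather than an exhaustive loop.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def collocate_nearest(cloudsat_footprints, modis_pixels, max_km=None):
    """For each CloudSat (lat, lon) footprint, return the index of the closest
    MODIS (lat, lon) pixel, or None if no pixel lies within max_km.

    max_km is a hypothetical cutoff; the source text does not state one.
    """
    matches = []
    for lat, lon in cloudsat_footprints:
        best_idx, best_dist = None, float("inf")
        for i, (mlat, mlon) in enumerate(modis_pixels):
            dist = great_circle_km(lat, lon, mlat, mlon)
            if dist < best_dist:
                best_idx, best_dist = i, dist
        if max_km is not None and best_dist > max_km:
            best_idx = None
        matches.append(best_idx)
    return matches
```

Once each footprint carries the index of its matched MODIS pixel, the CloudSat precipitation and cloud-type flags can be attached to the corresponding MODIS LWP value.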

CloudSat CPR LWP and LWC
CloudSat provides vertical profiles of radar reflectivity measured by a 94 GHz cloud profiling radar (CPR) with a minimum sensitivity of ~ -30 dBZ. The profiles extend between the surface and 30 km altitude with a vertical resolution of 240 m and a footprint of about 2.5 km along track and 1.4 km cross track. To date, two official retrieval products for LWC/LWP are available from the CloudSat data processing center: 2B-CWC-RO and 2B-CWC-RVOD. While 2B-CWC-RO uses only the measured radar reflectivity, 2B-CWC-RVOD uses both the radar reflectivity and the visible optical depth retrieved from CloudSat and MODIS measurements together. For this study, the 2B-CWC-RO4 data are used; the detailed sensitivity and uncertainty of this retrieval algorithm are discussed in Austin et al. (2009). The time period of the data used in this study is from January 2007 to December 2010. The interpretation of reflectivity as cloud liquid water is straightforward in the absence of precipitation. When larger particles such as precipitation are present, however, they grossly distort the estimate of LWC because of the high sensitivity of radar reflectivity to the presence of large particles. Other shortcomings of the CloudSat data include (1) the effects of ground clutter, which mask the lowest kilometer so that the liquid water content of a significant fraction of low clouds is undetected, and (2) the ambiguous effect of mixed-phase clouds and deep convection on radar reflectivity, which makes the interpretation of reflectivity in terms of liquid water content problematic.

Synergistically Constructed LWC Using MODIS LWP and CloudSat LWC
Each of the cloud liquid water data products described above has particular limitations, especially when precipitation is present. For a meaningful comparison between the satellite-estimated and model-simulated LWP, the LWP observations derived for convective/precipitating liquid water clouds should be rejected and removed. Convective clouds and precipitation were identified using an approach referred to as the FLAG method (Li et al. 2008; Waliser et al. 2009), which rejects all the retrievals in any profile that is flagged as precipitating (including drizzle) at the surface (from CloudSat 2C-PRECIP-COLUMN data) and excludes any retrieval within the profile whose cloud type is classified as "deep convection" or "cumulus" (from CloudSat 2B-CLDCLASS data). By excluding these portions of the liquid mass, we obtain an estimate of the cloud-only portion of the LWP/LWC (hereafter referred to as CLWP/CLWC). This methodology of estimating CLWP/CLWC and CIWC/CIWP was used in our previous CMIP3 model-data comparisons of LWC and IWC (e.g., Li et al. 2008, 2011, 2012; Waliser et al. 2009).
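The two rejection rules of the FLAG method can be illustrated with a short sketch. The profile layout (a dictionary with a surface precipitation flag and per-bin cloud types and LWC values) and all names below are hypothetical conveniences, not the actual structure of the 2C-PRECIP-COLUMN or 2B-CLDCLASS files.

```python
# Cloud types whose retrievals are excluded bin by bin (as named in the text).
EXCLUDED_CLOUD_TYPES = {"deep convection", "cumulus"}


def flag_filter(profile):
    """Apply the two FLAG rules to a single profile.

    profile is a hypothetical dict with keys:
      "surface_precip": True if rain or drizzle is flagged at the surface
      "cloud_type":     list of cloud-type labels, one per vertical bin
      "lwc":            list of LWC retrievals, one per vertical bin

    Returns None if the whole profile is rejected; otherwise a list of LWC
    values in which excluded bins are replaced by None.
    """
    if profile["surface_precip"]:
        return None  # rule 1: reject every retrieval in a precipitating profile
    return [
        None if cloud_type in EXCLUDED_CLOUD_TYPES else lwc  # rule 2: per-bin exclusion
        for lwc, cloud_type in zip(profile["lwc"], profile["cloud_type"])
    ]
```

Keeping rejected bins as `None` rather than dropping them preserves the vertical indexing, which is convenient when the surviving bins are later averaged level by level.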
The MODIS and CloudSat CLWP are further filtered to exclude mixed-phase cloud conditions, because the partitioning of cloud water into liquid and ice in such conditions is unreliable, as described above. For CloudSat, a collocated ECMWF temperature profile (from CloudSat ECMWF-AUX data) is used to determine whether a given cloud water content profile contains mixed-phase clouds. If the temperature profile contains a section with temperature higher than -20°C and lower than 0°C, the CloudSat LWC retrieval algorithm is used to partition the cloud water content into liquid and ice. Since the reliability of this partitioning method is questionable, we filter out CloudSat-retrieved LWC profiles that contain a section with temperature higher than -20°C and lower than 0°C. For MODIS, the cloud phase is determined by utilizing distinct differences in bulk absorption characteristics between water and ice at infrared wavelengths. This method allows MODIS to distinguish supercooled clouds from mixed-phase or ice-containing clouds. Because of this distinction, the MODIS CLWP filtered for the no-mixed-phase cloud condition captures the supercooled liquid water path, while the CloudSat CLWP filtered for the no-mixed-phase cloud condition does not.
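A minimal sketch of the temperature-based CloudSat filter is given below. It assumes one reading of the text, namely that a profile is rejected when any bin that actually contains cloud water lies strictly between -20°C and 0°C; the profile layout and the function names are hypothetical.

```python
# Temperature window (deg C) in which the liquid/ice partition is considered unreliable.
MIXED_PHASE_RANGE_C = (-20.0, 0.0)


def has_mixed_phase_section(temperature_c, cloud_water):
    """True if any bin holding cloud water lies strictly inside the window.

    Restricting the check to bins with cloud water > 0 is an assumption made
    for this sketch; the text only speaks of a "section" of the profile.
    """
    low, high = MIXED_PHASE_RANGE_C
    return any(
        water > 0.0 and low < temp < high
        for temp, water in zip(temperature_c, cloud_water)
    )


def filter_mixed_phase(profiles):
    """Keep only the profiles without a mixed-phase section."""
    return [
        p for p in profiles
        if not has_mixed_phase_section(p["temperature_c"], p["cloud_water"])
    ]
```

Under this rule a supercooled liquid cloud at, say, -5°C is discarded together with genuinely mixed-phase clouds, which is consistent with the statement that the filtered CloudSat CLWP does not capture the supercooled liquid water path.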
The filtered CLWP of MODIS and CloudSat are shown in Figs. 1c and d. Under these conditions of non-convective, non-precipitating clouds, the liquid water path data from MODIS and CloudSat broadly agree with each other except in regions of low stratiform clouds and mixed-phase clouds (mid-latitude storm tracks and high latitudes). Also shown in Fig. 1 are the MODIS LWP retrieved under all conditions (Fig. 1a) and under the conditions of precipitation and convective clouds (Fig. 1b). Since CloudSat does not provide meaningful information in the presence of precipitation, only data from MODIS are shown. Since MODIS is relatively insensitive to precipitation, the difference between the non-precipitating and precipitating LWP shown is interpreted as a measure of the increased cloud LWP associated with precipitating clouds. Precipitating clouds tend to be deeper and contain more cloud liquid than non-precipitating clouds. The small difference between the MODIS all-condition LWP (Fig. 1a) and the MODIS filtered LWP (Fig. 1c) confirms that MODIS is relatively insensitive to precipitation. Figure 2 shows annual, zonally averaged LWP quantities obtained from the three observational estimates: (1) MODIS all-condition LWP, (2) MODIS non-convective, non-precipitating, no-mixed-phase cloud LWP, and (3) CloudSat non-convective, non-precipitating, no-mixed-phase cloud LWP. The CLWP estimates from CloudSat and MODIS generally agree relatively well in the tropical and subtropical regions, but they differ significantly in the mid- and high latitudes because of the differences in the corresponding no-mixed-phase cloud conditions mentioned above.
Another observational reference we prepare for this study is a liquid water content profile that is synergistically constructed using MODIS LWP and CloudSat LWC. As illustrated in Figs. 1 and 2, the CloudSat filtered CLWP is a small portion of the MODIS filtered CLWP globally (12.7 g m⁻² versus 35.8 g m⁻² in global averages), especially in high latitudes and in stratocumulus regions, because it misses supercooled clouds and low-topped clouds. Figure 1 also shows that the contribution from precipitating clouds is expected to be about 10% of the cloud liquid water path in all conditions (5.4 g m⁻² out of 41.1 g m⁻² in global averages). Therefore, CloudSat LWC by itself is not sufficient to provide the vertical profile of CLWC for the comparison with models. We complement the CloudSat LWC with LWC derived from MODIS LWP values in the conditions that CloudSat misses. Namely, we construct CLWC for precipitating clouds, LWC for low-topped clouds, and LWC for supercooled clouds. All of these cloud types are missed in the CloudSat filtered LWC discussed above.
In order to construct LWC from MODIS LWP, we need to define the cloud top height, the cloud base height, and the vertical structure of the cloud water content. We use the cloud top height determined by the combined CALIPSO lidar and CloudSat radar retrieval algorithm (from CloudSat GEOPROF-LIDAR data). Details for finding the lifting condensation level (LCL) are described in Appendix A section 4.
We select cloud pixels that are not included in the CloudSat CLWC filtering condition. For these cloud pixels, we use MODIS LWP to construct LWC using the method described above. The MODIS-derived LWC are decomposed into several conditions so that the relative contributions from the different conditions can be quantified. The MODIS LWC data are first divided into two groups: (1) low-topped clouds, with a cloud top height less than 1 km, and (2) mid-topped clouds, with a cloud top height greater than 1 km. This division tests how much of the low-topped cloud is missed by CloudSat and how much the low-topped clouds contribute to the overall CLWC. The mid-topped clouds are further divided into two groups: (1) precipitating clouds and (2) non-precipitating clouds. This division shows the relative contribution from precipitating clouds, which are less well constrained by both MODIS and CloudSat satellite observations than non-precipitating clouds. The uncertainty of the overall CLWC is largely affected by the amount of water content from the precipitating clouds. Knowing the relative contribution of the precipitating clouds is therefore very helpful for the uncertainty estimation, even though the absolute values are not reliable.
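The subgroup decomposition described above amounts to a simple two-level classification, sketched below. The function names, the group labels, and the input layout are illustrative assumptions; pixels with a cloud top of exactly 1 km, which the text does not assign to either group, are placed in the mid-topped group here.

```python
LOW_TOP_LIMIT_KM = 1.0  # threshold separating low-topped from mid-topped clouds


def classify_modis_pixel(cloud_top_km, is_precipitating):
    """Assign a MODIS-derived cloud pixel to one of the three subgroups."""
    if cloud_top_km < LOW_TOP_LIMIT_KM:
        return "low-topped"
    # Exactly 1 km falls through to the mid-topped groups (assumption).
    if is_precipitating:
        return "mid-topped precipitating"
    return "mid-topped non-precipitating"


def subgroup_lwp_fractions(pixels):
    """Relative LWP contribution of each subgroup.

    pixels is an iterable of (cloud_top_km, is_precipitating, lwp) tuples.
    Returns a dict mapping subgroup label to its fraction of the total LWP.
    """
    totals = {}
    for cloud_top_km, is_precipitating, lwp in pixels:
        label = classify_modis_pixel(cloud_top_km, is_precipitating)
        totals[label] = totals.get(label, 0.0) + lwp
    grand_total = sum(totals.values())
    if grand_total <= 0.0:
        return {}
    return {label: value / grand_total for label, value in totals.items()}
```

Reporting fractions rather than absolute amounts mirrors the point made in the text: the relative contribution of the precipitating subgroup is informative for the uncertainty estimate even where its absolute LWP is not reliable.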
Note that we use MODIS as complementary data to CloudSat by adding MODIS data when CloudSat data are not available or are unreliable. This means that the three MODIS cloud conditions considered are clouds that are not detected or not reliably retrieved by CloudSat. Therefore, each of the three MODIS cloud conditions is not the same as all clouds detected by MODIS in that condition. For example, the MODIS non-raining mid-level clouds in this study do not represent all MODIS-detected non-raining mid-level clouds. They instead represent non-raining mid-level clouds that are detected by MODIS but not detected or not retrieved by CloudSat. Figure 3a shows the annual mean and global average of LWC quantities obtained with the method described above. In order to see the relative contribution of the different cloud conditions to the overall CLWC, the CLWC of each cloud condition is plotted separately. Globally, the contribution of CloudSat CLWC is about 25% of the total estimated CLWC (black line, the sum of all the CLWCs from the subgroups). The low cloud contribution missed by CloudSat is significant near 900 hPa. The contribution from precipitating clouds is relatively small but is a measurable amount that can give a systematic negative bias to the observational estimate if neglected. Finally, the non-precipitating clouds that are missed by the CloudSat filtered CLWC contribute significantly. These non-precipitating clouds are mainly composed of the supercooled clouds that CloudSat cannot retrieve well, because its retrieval algorithm artificially partitions cloud water content into liquid and ice when the temperature is between -20 and 0°C even though the clouds are supercooled. Figure 3b shows the annual, zonal mean of the relative contributions of LWC quantities by comparing the corresponding LWP values in the subgroups. Figure 3c shows the relative occurrence frequency (ROF) of the cloud subgroups.
All three MODIS cloud subgroups show an increased contribution to liquid water path (and therefore to content as well) in the higher latitudes. This is a direct consequence of the no-mixed-phase condition used to filter the CloudSat clouds. The largest contribution missed by CloudSat is from no-rain mid-level clouds, followed by low clouds and rain mid-level clouds. Comparing Figs. 3b and c, the order of the relative size of the liquid water path does not always follow the order of its ROF. For example, the CloudSat cloud ROF is lower than that of the MODIS no-rain mid-level clouds, yet the CloudSat cloud LWP is larger than the MODIS no-rain mid-level cloud LWP in the tropics. This suggests that the no-rain mid-level clouds missed by CloudSat are mainly tropical low-value LWP clouds. Overall, it is very clear that the complementary use of CloudSat and MODIS is critical in estimating the LWC because many subgroups of clouds are not well retrieved by CloudSat alone. The contributions of the clouds missed (meaning either not detected or not retrieved reliably) by CloudSat are not negligible at any latitude, and especially not at high latitudes. The method we used to construct the liquid water content using MODIS is not perfect because the information available in these conditions is limited (i.e., there is no direct information about the vertical distribution of the cloud mass). However, the zeroth-order estimate of LWC from all cloud types is a useful observational reference for comparisons with models.
Apart from the uncertainty of the retrieval method, an additional uncertainty to consider when making model-observation comparisons concerns the differences in spatial and temporal sampling between the observations and the GCMs, such as those in the CMIP archives. Li et al. (2012) and Guan et al. (2013) found that the bias introduced by the satellite sampling of cloud water is negligible, being within 3% of the standard deviation of the unsampled data. It is therefore reasonable to compare the observed, satellite-sampled liquid water estimates to those from the GCMs without the need to sample the GCMs along the A-Train satellite track (cf. Jiang et al. 2012).

Modeled Values of LWC and LWP
On the modeling side, LWC is usually a prognostic variable based on a balance equation with contributions from large-scale advection and from parameterizations of subgrid-scale convective cloud, shallow cumulus, and stratocumulus. LWP is obtained as the vertical integral of LWC. The models examined in this study, except for GFDL-CM3, do not include liquid water mass from precipitating rain and/or convective-type clouds in their LWC. Therefore, we consider the model LWP/LWC as CLWP/CLWC. Following Li et al. (2012) and using the observations described in section 2, we evaluate CLWP/CLWC in the ECMWF (ERA-Interim, Dee et al. 2011) and NASA MERRA reanalyses, coupled atmosphere-ocean GCMs (CGCMs) from CMIP3 (for CLWP only), CGCMs from CMIP5, and two additional state-of-the-art GCMs: the UCLA GCM and the NASA GEOS5 GCM. The CMIP3 simulations are the same as those described in Li et al. (2008, 2011, 2012), although excluding the two UKMO models, for which we have learned that the provided CLWP output was incompatible with the CMIP3 output specifications (cf. Li et al. 2011). In CMIP3 and CMIP5, the intended meaning of "clwvi" is total water path, i.e., ice plus liquid. Our investigation (referred to as LET) has determined that thirteen of the CMIP3 GCMs (labeled in LET as: ccc_ma63, cnrm, csiro, gfdl, iap, ipsl, mirochr, gisseh, gisser, inmcm, mpi, ukmogem, and ukmocm) provided output that was consistent with the intended interpretation [i.e., the total water path (TWP)], while three of the GCMs (labeled in LET as: bccr, csiro, ncar) provided output with the interpretation that the quantity was just the cloud liquid water path (i.e., CLWP). For two GCMs, ukmocm3 and ukmogem, the total water path (TWP) output provided to CMIP3 did not include water associated with the convection scheme.
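The statement earlier in this section that LWP is the vertical integral of LWC can be made concrete with a short numerical sketch. It assumes LWC is given in mg m⁻³ on height levels in metres and uses the trapezoidal rule; the function name and the choice of quadrature are illustrative and are not taken from any of the models discussed here.

```python
def liquid_water_path(lwc_mg_m3, height_m):
    """Integrate an LWC profile (mg m^-3) over height (m) with the
    trapezoidal rule and return the liquid water path in g m^-2."""
    if len(lwc_mg_m3) != len(height_m):
        raise ValueError("LWC and height profiles must have the same length")
    total_mg_m2 = 0.0
    for k in range(len(height_m) - 1):
        layer_depth = height_m[k + 1] - height_m[k]
        layer_mean_lwc = 0.5 * (lwc_mg_m3[k] + lwc_mg_m3[k + 1])
        total_mg_m2 += layer_mean_lwc * layer_depth
    return total_mg_m2 / 1000.0  # convert mg m^-2 to g m^-2
```

For example, a cloud with 100 mg m⁻³ at two interior levels of a profile sampled every 240 m (the CloudSat vertical resolution) and zero at the levels above and below integrates to 48 g m⁻². Models that carry cloud liquid as a mass mixing ratio on pressure levels would instead integrate over pressure and divide by the gravitational acceleration.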
In CMIP5, twelve GCMs (bcc, bccesm, CanESM2, inmcm4, inmcm4_esm, CNRM, GISSE2R, GISSE2H, MRI, NorESM1, GFDL-CM3) provided output that was consistent with the intended interpretation of "clwvi" (i.e., TWP), while eight of the CMIP5 GCMs (CCSM4, CSIRO, IPSL, MPI, MIROC4h, MIROC5, MIROC-ESM, MIROC-ESM-CHEM) provided output with the interpretation that the quantity was just the cloud liquid water path (i.e., CLWP). Given this situation, we derived the unfiltered CLWP for those models in CMIP3 and CMIP5 that provided TWP using the following relationship: LWP = TWP - IWP, where IWP is the cloud ice water path. Table 1a lists the CMIP5 simulations included in this study, and Table 1b summarizes the cloud microphysics parameterizations used in the selected CMIP5 models. As mentioned in Li et al. (2012), the performance of simulated warm cloud properties in CMIP5 models arises from a highly coupled system (land, ocean, atmosphere, etc.), and the behavior is not likely to be explained simply by any single component or scheme. It depends instead on the details of each model's specific schemes and on the coupling among the schemes related to a particular process, such as clouds, aerosols, and turbulence for boundary layer clouds, and clouds and convection for deeper clouds, as well as on the interactions with sea surface temperatures (SSTs). Nevertheless, we attempt to explain the causes of the behavior of some of the best and worst performing models in section 5.
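The handling of the two "clwvi" interpretations can be summarized in a few lines. In this sketch the function name, the argument names, and the boolean switch are hypothetical; whether a given model reports TWP or CLWP has to be supplied by the user from the lists above.

```python
def derive_clwp(clwvi, ice_water_path, clwvi_is_total):
    """Return the unfiltered CLWP for one model, cell by cell.

    clwvi          : values reported under the "clwvi" name
    ice_water_path : cloud ice water path on the same cells and in the same units
    clwvi_is_total : True if the model followed the intended meaning
                     (total water path, ice plus liquid)
    """
    if not clwvi_is_total:
        # The model already reported liquid only, so nothing is subtracted.
        return list(clwvi)
    if len(clwvi) != len(ice_water_path):
        raise ValueError("clwvi and ice water path must cover the same cells")
    # LWP = TWP - IWP for models that reported the total water path.
    return [total - ice for total, ice in zip(clwvi, ice_water_path)]
```

Applying the subtraction to a model that already reported liquid only would remove the ice path twice, which is why the interpretation has to be settled model by model before any comparison.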
The specific experimental scenario is the historical 20th century simulation, which used the observed 20th century greenhouse gas, ozone, aerosol, and solar forcing. The time period used for the long-term mean is 1970-2005, and if a model provided an ensemble of simulations, only one of them was chosen for this evaluation.
One cautionary remark for the model-observation comparison concerns the difference between the water mass from cloud particles in precipitating conditions and the water mass from precipitating particles in precipitating conditions. These two water mass sources differ in terms of the contributing particle sizes. On the observation side, both water mass sources are removed because any retrievals from precipitating conditions are filtered out by the FLAG filtering method described in section 2. On the model side, all of the models examined in this study include the water mass from cloud particles in precipitating conditions, but only the two GFDL models include the water mass from precipitating particles in addition to that from cloud particles.
Ideally, for the comparison with the non-precipitating CloudSat and MODIS observational estimates, precipitating profiles from the model should also be excluded. However, at this stage this is not feasible because model precipitation is parameterized and "averaged" in time and over a grid box (in contrast to a snapshot CloudSat cloud profile), that is, over a physical model time step (~30 min to 1 h) and at a significantly coarser resolution (~50-100 km) than the size of the instantaneous "snapshot" CloudSat footprint (~1-2 km) or MODIS pixel (1 km). Thus, in a GCM, it is difficult to determine a threshold for non-precipitating and/or convective mass that is equivalent to the CloudSat footprint or MODIS pixel resolution.
Given this situation, we use the filtered MODIS and CloudSat observational estimates of LWP as a lower limit. At the same time, the MODIS unfiltered CLWP can serve as an upper limit since it includes cloud liquid water (i.e., no precipitating particles) in precipitating conditions, although MODIS retrievals in these conditions are known to be less reliable and to overestimate LWP in comparison with in situ measurements (King et al. 2013; see section 2 for discussion). Additionally, we have combined the MODIS and CloudSat cloud retrievals to construct the CLWC profile for all conditions (see section 2 for discussion), which includes cloud water profiles in precipitating conditions. These estimates should be used with caution because the uncertainties/errors in the retrievals from precipitating conditions are large (> 100%).
Unlike all the other models examined in this study, which do not include liquid mass from precipitating rain and convective-type clouds in their CLWC, the two GFDL models include this liquid mass by adding grid means over shallow cumulus, deep cumulus cells, and convective mesoscale clouds, weighted by their respective area fractions. Thus, the GFDL models should be considered somewhat carefully with respect to the other models, and their CLWC/CLWP fields are more commensurate with the total liquid water content (TLWC) and total liquid water path (TLWP), which include water mass from both cloud particles and precipitating particles. Currently, we do not have reliable global observational datasets to estimate TLWP because the available observing instruments are either sensitive only to cloud particles (as with MODIS) or more sensitive to the precipitating particles, so that the contribution from cloud particles cannot be retrieved reliably (as with CloudSat).
For both the GCM and observational data sets, all fields have been re-gridded to 40 levels in the vertical (with a constant pressure interval of 25 hPa) and mapped onto a common 8° × 4° (longitude by latitude) grid.
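A minimal sketch of such a common grid is given below. Several details are assumptions made only for illustration: the 40 levels are taken to run from 25 to 1000 hPa, the horizontal mapping is a plain unweighted average of all samples falling in each 8° × 4° cell, and the function names are hypothetical. The regridding actually used in this study may differ (for example, by using area weighting or interpolation).

```python
def target_pressure_levels(top_hpa=25.0, step_hpa=25.0, count=40):
    """Forty levels at a constant 25 hPa interval; the 25-1000 hPa range
    is an assumption, since the text gives only the count and the interval."""
    return [top_hpa + k * step_hpa for k in range(count)]


def bin_average(samples, dlon=8.0, dlat=4.0):
    """Average (lat, lon, value) samples into dlon x dlat cells.

    Returns a dict mapping (lat_index, lon_index) to the unweighted mean of
    the samples in that cell. Cells are counted from 90S and from 0E.
    """
    n_lat = int(round(180.0 / dlat))
    n_lon = int(round(360.0 / dlon))
    sums, counts = {}, {}
    for lat, lon, value in samples:
        lat_index = min(int((lat + 90.0) // dlat), n_lat - 1)  # keep 90N in the last row
        lon_index = int((lon % 360.0) // dlon) % n_lon          # wrap negative longitudes
        key = (lat_index, lon_index)
        sums[key] = sums.get(key, 0.0) + value
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}
```

With these settings the horizontal grid has 45 × 45 cells, and putting models and observations on the same cells is what allows cell-by-cell differences to be formed later.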
ERA-Interim (CY31r1) | Mixing ratio of cloud condensate | Bulk single moment; mixing ratio of cloud condensate with temperature-dependent partitioning (the bounds are adjustable constants, with current settings of ice at T = -15°C and liquid at T = 0°C).
GEOS5 | Single mixing ratio of cloud condensate | Bulk single moment; mixing ratio of cloud condensate with temperature-dependent partitioning; "anvil" cloud originates in detraining convection; "large-scale cloud" originates in a probability distribution function (PDF) based condensation calculation.

Results
Overall, the multi-model mean CMIP5 CLWP values (Fig. 4s) are similar to observations in terms of spatial distribution, but they are biased high globally even when compared against the MODIS unfiltered CLWP. Individually, most models tend to capture the global and regional CLWP patterns qualitatively. This includes the relatively high values of CLWP in the storm tracks from the subtropics to high latitudes. However, they all overestimate CLWP over the ITCZ and warm pool. Note that the relative magnitudes of the tropical and mid-latitude values can be quite different across models.
None of the CMIP5 models provides a good representation of both the magnitude and the spatial pattern of CLWP. The three that perform relatively better are CSIRO, inmcm4, and inmcm4_esm, but even these have significant shortcomings relative to the observational data. Most of the models overestimate tropical CLWP (by a factor of ~2 or more). The MIROC, MIROCCHEM, and CCSM4 GCMs greatly overestimate tropical and storm track CLWP (by a factor of ~5). The IPSL, CSIRO, MIROC5, MIROC4h, and the two GISS GCMs moderately overestimate CLWP in the extra-tropics. For the non-CMIP5 GCMs, the GEOS5 atmospheric GCM (AGCM) overestimates CLWP in the storm tracks (by a factor of ~2), while the UCLA GCM does relatively well over most of the globe except over the northeast of South America. The two reanalyses, ECMWF and MERRA, show relatively good spatial patterns of CLWP, with both being biased high (by a factor of ~1-2). The GFDL model simulates and provides output on TLWP. When compared with the observational estimates of CLWP (Figs. 4y-aa), GFDL exhibits relatively good agreement in the extra-tropical storm track regions, but its TLWP is larger than the observational CLWP in the tropical ITCZ and warm pool. Since the observational CLWP does not include water mass from deep convective cores and precipitating hydrometeors, it is uncertain whether this difference indicates a model bias in the GFDL TLWP estimates.

Figure 5 shows the long-term annual zonal average of the CLWP quantities associated with CMIP5 displayed in Fig. 4. Figure 5a shows the multi-model mean (blue line) and the upper and lower bounds at one standard deviation from the mean (red lines). The latitudinal distributions clearly illustrate a wide spread of CLWP in the CMIP5 models. Figure 5b shows the observational estimates of CLWP from CloudSat and MODIS. The green/black line is the MODIS unfiltered/filtered CLWP, which includes/excludes the cloud water mass from both convective cores and precipitating clouds.
The magenta line is the CloudSat filtered CLWP which excludes the cloud water mass from convective cores and precipitating clouds. The multi-model mean (the blue line) and the model standard-deviation added and removed to the mean (the red lines) are also plotted for comparison with the observational estimates. It is evident that the multimodel mean CLWP is one standard deviation larger than the observational MODIS unfiltered CLWP and filtered CLWP. Since the MODIS unfiltered CLWP is the upper limit of the observational CLWP, this model-observation comparison illustrates the model CLWP means are significantly larger than the observational CLWP.
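The zonal-mean curves described for Fig. 5 can be sketched as follows. This is a minimal illustration, not the processing used in the study: the function name, the array layout, and the choice of a population standard deviation (`ddof=0`) are assumptions, and the input values are toy numbers.

```python
import numpy as np

def zonal_mean_spread(clwp):
    """Multi-model zonal-mean CLWP and its one-standard-deviation envelope.

    clwp: array of shape (n_models, n_lat, n_lon), annual-mean CLWP in g m^-2.
    Returns (mean, lower, upper), each of shape (n_lat,).
    """
    zonal = np.nanmean(clwp, axis=2)   # zonal mean for each model
    mean = zonal.mean(axis=0)          # multi-model mean at each latitude
    std = zonal.std(axis=0, ddof=0)    # inter-model spread (assumed ddof=0)
    return mean, mean - std, mean + std

# Hypothetical toy ensemble: 3 models, 2 latitudes, 4 longitudes
toy = np.stack([np.full((2, 4), v) for v in (50.0, 100.0, 150.0)])
mean, lower, upper = zonal_mean_spread(toy)
```

An observational zonal mean lying below `lower` at a given latitude corresponds to the situation described above, where the multi-model mean exceeds the observations by at least one standard deviation.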
To summarize the multi-model performance of CMIP3 and CMIP5 in representing the time-mean pattern of CLWP, Fig. 6 shows the multi-model mean biases against the observed estimate, calculated across the model ensembles. The observational estimate used in this calculation is the MODIS CLWP filtered for no-rain and no-convection conditions. Note that the MODIS filtered CLWP is provided only over oceans, so the ensemble mean bias is obtainable only over oceans. The patterns of CLWP bias in CMIP3 (Fig. 6a) and CMIP5 (Fig. 6b) are systematically similar: CLWP is overestimated globally. In CMIP3, the ITCZ/SPCZ, high latitudes, and storm tracks are significantly biased high (65.2 g m⁻²), while in CMIP5 the bias (78.3 g m⁻²) is even higher. Therefore, based on this initial comparison, the fidelity of CMIP5 models in representing cloud liquid mass exhibits no progress relative to CMIP3.
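A mean bias of this kind can be computed as an area-weighted average of the model-minus-observation difference over the grid cells where the observation is valid (here, ocean only). The sketch below is an assumption about how such a number could be obtained, not the procedure used in the study: it uses cosine-of-latitude weights on a regular grid, marks unavailable observations with NaN, and uses toy values.

```python
import numpy as np

def area_weighted_bias(model, obs, lat):
    """Area-weighted mean of (model - obs) over cells where both are valid.

    model, obs: arrays of shape (n_lat, n_lon); obs is NaN where unavailable
    (e.g., over land). lat: latitudes in degrees, shape (n_lat,).
    """
    weights = np.cos(np.deg2rad(lat))[:, None] * np.ones_like(obs)
    valid = np.isfinite(obs) & np.isfinite(model)
    diff = np.where(valid, model - obs, 0.0)   # zero out invalid cells
    return (weights * diff * valid).sum() / (weights * valid).sum()

# Hypothetical toy fields (g m^-2); one cell has no observation
lat = np.array([0.0, 60.0])
obs = np.array([[50.0, 50.0], [50.0, np.nan]])
model = np.array([[100.0, 100.0], [150.0, 150.0]])
bias = area_weighted_bias(model, obs, lat)
```

In this toy case the two equatorial cells (weight 1) have a bias of 50 and the single valid cell at 60° (weight 0.5) has a bias of 100, giving a weighted mean of 60 g m⁻².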
The CMIP3 and CMIP5 multi-model biases against the observational estimates of CLWP have implications for radiative flux calculations in the models. The radiative fluxes are affected by LWP from all types of hydrometeors, but all the models in CMIP3 and CMIP5 (except GFDL) include only the contribution from cloud particles, excluding the contribution from precipitating and deep convective hydrometeors. Therefore, the underestimation of water mass that results from including only cloud water can lead to systematic biases in radiative fluxes. The present study shows that the cloud-only water mass in the models is significantly overestimated, which may serve to compensate for the error of excluding the water mass from precipitating and deep convective hydrometeors, since the models are better constrained to achieve the correct radiative fluxes.
Figure caption (fragment): CloudSat CLWP under non-precipitating, non-convective, and non-mixed-phase conditions; (z) MODIS CLWP under non-precipitating, non-convective conditions; and (aa) MODIS CLWP under all conditions. Here the reference used is from (ab) for CLWP and (y) for lower limit and (aa) as upper limit for CLWP (see section 2).
In summary, regarding the time-mean pattern of CLWP, the models are biased high (by more than a factor of 2, and in some cases 3) in mid- and high-latitudes. The exceptions are some CMIP5 models (CNRM, CSIRO, the two GISS models, MPI, and MRI), the uncoupled models (UCLA and GEOS5), and the reanalyses (MERRA and ECMWF-Interim), which are comparable to the MODIS unfiltered CLWP. In the CMIP5 models, mid- and high-latitude regions have stronger biases than tropical regions.
To further quantify and synthesize the comparative information discussed above, we use a Taylor diagram (Taylor 2001), as we did earlier for IWC. The Taylor diagram is a commonly used statistical summary that relates two measures of model fidelity: the spatial correlation and the spatial standard deviation (Taylor 2001). These statistics are calculated for the long-term time mean over the global ocean-only domain (area-weighted). The reference dataset is plotted along the x-axis at the value 1.0. The radial distance from the origin is proportional to the ratio of the standard deviation of the given dataset to that of the reference dataset. The azimuthal angle represents the spatial correlation between the given dataset and the reference dataset. The standard deviation ratio indicates the relative amplitude of the simulated and the "reference" variations, whereas the correlation indicates the degree of similarity of the variations between the two.
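The two statistics plotted on a Taylor diagram can be sketched as below. This is an illustrative implementation under stated assumptions rather than the code used in the study: it assumes a regular latitude-longitude grid with cosine-of-latitude area weights, marks cells outside the ocean domain with NaN, and the function names `taylor_stats` and `taylor_xy` are hypothetical.

```python
import numpy as np

def taylor_stats(model, ref, lat):
    """Area-weighted standard deviation ratio and pattern correlation.

    model, ref: arrays of shape (n_lat, n_lon); NaN marks excluded cells.
    lat: latitudes in degrees, shape (n_lat,).
    """
    w = np.cos(np.deg2rad(lat))[:, None] * np.ones_like(ref)
    valid = np.isfinite(model) & np.isfinite(ref)
    w = np.where(valid, w, 0.0)
    w = w / w.sum()                              # normalized area weights
    m = np.where(valid, model, 0.0)
    r = np.where(valid, ref, 0.0)
    m_anom = m - (w * m).sum()                   # remove weighted means
    r_anom = r - (w * r).sum()
    std_m = np.sqrt((w * m_anom**2).sum())
    std_r = np.sqrt((w * r_anom**2).sum())
    corr = (w * m_anom * r_anom).sum() / (std_m * std_r)
    return std_m / std_r, corr

def taylor_xy(std_ratio, corr):
    """Diagram position: radius = std ratio, angle = arccos(correlation)."""
    theta = np.arccos(np.clip(corr, -1.0, 1.0))  # guard against round-off
    return std_ratio * np.cos(theta), std_ratio * np.sin(theta)

# Hypothetical toy fields: the "model" doubles the reference amplitude
lat = np.array([-30.0, 30.0])
ref = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
ratio, corr = taylor_stats(2.0 * ref + 3.0, ref, lat)
x_ref, y_ref = taylor_xy(1.0, 1.0)
```

The reference dataset has a ratio of 1 and a correlation of 1, so it sits at (1, 0) on the x-axis; a model's distance from that point combines its amplitude error and its pattern error.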
We use the MODIS filtered CLWP as the reference for the Taylor diagram analysis. The Taylor diagram shown in Fig. 7 summarizes the degree of agreement in both the overall spatial pattern correlations and the standard deviations for the individual CMIP5 CGCMs, their multi-model mean, the two reanalyses, and three other GCMs. The MODIS unfiltered CLWP is used as another observational estimate to give a range of the observational uncertainties. The two reanalyses (ECMWF and MERRA; see Table 1b) and the AGCM simulations (i.e., specified SST; GEOS5, see Table 1b) perform as a group considerably better than the CMIP coupled GCMs in terms of the standard deviation ratio. The former have correlations between 0.2 and 0.4 and standard deviation ratios between 0.8 and 1.1. The AGCM simulations are expected to perform better in various ways (such as the spatial distribution of the ITCZ, SPCZ, and SACZ) than CGCM simulations due to the use of prescribed SSTs. In a coupled GCM, the horizontal distribution of surface ocean heat fluxes (radiation, latent and sensible heat, as well as surface wind stress, etc.) is important in driving the model ocean dynamics (in particular in a long climate run) and therefore the distribution and values of the SSTs, which determine the locations of the warm pool and ITCZ/SPCZ as well as the distribution and values of LWP/IWP, cloud fraction, etc. In an uncoupled AGCM, on the other hand, prescribed SSTs are used, such that its surface heat fluxes, PBL, cloud-topped PBL, and convection with clouds rooted in the PBL are strongly constrained by these prescribed SSTs, resulting in more realistic precipitation, deep convection, and clouds and their associated LWC/LWP and IWC/IWP.
For the group of CMIP5 models (red), most have correlations from about 0.3 to 0.8, with standard deviation ratios above 1 and up to 4, except for the two Inmcm4 models with values of 0.44. The two bcc models and the two MIROC models (MIROC and MIROC-CHEM), as well as CCSM4, have the highest standard deviation ratios, with values of 2.7-3.6. Other poorly represented CLWP fields by this metric, with extremely weak standard deviations, are those of the CMIP5 Inmcm4 and Inmcm4ESM models. In terms of the distance to the reference point, the two best performing CMIP5 models by this metric are CSIRO (F) and MPI (Q), with correlations of about 0.4-0.6 and standard deviation ratios of about 1.0 and 1.4, respectively. Since the precipitating LWP is not considered here, we do not include the GFDL model in this comparison. In principle, GFDL should be compared with an observational TLWP that includes contributions from precipitating particles, since the GFDL LWC includes both precipitating and non-precipitating particles. Currently, observational TLWP estimates are not available, as discussed in section 3.
For the non-CMIP5 models (black in Fig. 7), EC-Interim (B), MERRA (C), and GEOS 2.5 (A) perform well relative to the others in this group in terms of standard deviation ratio. However, they have advantages in this case: the GEOS run is AGCM-only and thus uses specified SSTs, and MERRA and ERA-Interim use data assimilation, while all other models examined here are fully coupled. Noteworthy in this regard is the relatively good performance of the UCLA GCM (non-CMIP5), with a value of 1.2 and metric values much better than most of the CMIP5 GCMs [except MPI (Q) and CSIRO (F)]. In most cases, the correlation values are poor, except for NorESM1 (0.74) and the two bcc models (0.8). The two MIROC models even have negative correlations, and CCSM4 and MRI have almost no correlation.
Next, we examine the fidelity of the model CLWC vertical structure using the synergistically constructed observational CLWC shown in Fig. 3. Such a comparison may provide clues as to where the model CLWP values are most awry. A comparison is given in Fig. 8, which shows the CLWC zonal and annual mean values from seventeen CMIP5 CGCMs (Figs. 8a-q; note that the CNRM-CM5 CGCM CLWC was not available from the CMIP5 data portal at the time), the UCLA GCM (Fig. 8r), and the GEOS5 AGCM (Fig. 8s), as well as MERRA (Fig. 8t) and ERA-Interim (Fig. 8u). These models provide output specifically on CLWC only. The GFDL-CM3 (Fig. 8w), on the other hand, provides output for TLWC, which includes precipitating LWC. Overall, there are significant disparities above 800 hPa between the CMIP5 CGCMs and the observational CLWC, with overall discrepancies ranging from multiplicative factors of about 0.2 of the observations (e.g., Inmcm4) to factors of 10 (e.g., the MIROC-CHEM and MIROC GCMs). Moreover, the general structure of the vertical distributions with respect to pressure level differs considerably among the models. The large model spread might be due to uncertainty in the mixed phase and supercooled phase. Each model has a different way of determining the thermodynamic phase of clouds and partitioning the cloud water mass into LWC and IWC. The different phase determination methods can lead to a large variation in LWC, in the low pressure levels in particular. About six of the CMIP5 models do a fair job in representing the vertical structure and magnitude of CLWC [i.e., CSIRO (Fig. 8e), IPSL (Fig. 8j), MIROC4h (Fig. 8k), MIROC5 (Fig. 8l), MPI (Fig. 8o), and MRI (Fig. 8p)]. Some models [bcc (Fig. 8a), bccesm (Fig. 8b), CanESM1 (Fig. 8c), CCSM4 (Fig. 8d), and NorESM1 (Fig. 8q)] tend to qualitatively capture the patterns of CLWC over mid- and high-latitudes below 800 hPa. The GEOS5 model and the two reanalyses from ECMWF and MERRA, on the other hand, tend to slightly overestimate CLWC in the tropics above 600 hPa. The UCLA GCM underestimates CLWC relative to the observational values throughout the vertical. However, it is reasonable to exercise caution when considering the fidelity of the observed values in these mid-lower tropospheric regions, or anywhere around the freezing level, as the observational data from both CloudSat and MODIS retrievals are filtered to exclude mixed-phase clouds. Therefore, any liquid contributions from mixed-phase clouds are not counted in the observational CLWC. Compared to the observed CLWC (Fig. 8x), the GFDL model captures the ITCZ in tropical regions, the extra-tropical storm tracks, and the polar regions well but slightly overestimates CLWC above 600 hPa. To summarize, for many CMIP5 models, it is apparent from Figs. 6, 7, and 8 that significant disparities exist not only horizontally in CLWP but also in the vertical structure of CLWC.
Figure 9 summarizes some of the basic features of Fig. 8 by showing the global mean model vertical profiles (80°N-80°S) against the global mean observed CloudSat CLWC (thick blue line) and the CloudSat+MODIS combined CLWC (thick red line). Some models, in particular the MIROC and MIROC-CHEM models, significantly overestimate CLWC with sharp increases below 500 hPa, while the two Inmcm4 models significantly underestimate CLWC at all levels. Most of the models, in particular the CCSM4, bcc, NorESM1, MIROC4h, and MIROC-ESM models, significantly overestimate CLWC with sharp increases above 600 hPa. The best simulated CLWC vertical profile is from CSIRO, followed by the MIROC5, MRI, MPI, and CanESM2 models (although keep in mind that CLWC from the CNRM-CM5 CGCM is not available). Both MERRA and GEOS5 capture CLWC in the lower troposphere well.
By decomposing the global averages into latitude-belt averages, we show in Figs. 9b, c, and d the average vertical profiles for the tropical convectively active regions (30°N-30°S) and the mid- and high-latitudes of both hemispheres (30-60°N and 30-60°S). Over the tropical convectively active regions (Fig. 9b), the models except MIROC5 generally do not capture the correct CLWC peak; CLWC values vary greatly from model to model, and all models significantly overestimate CLWC at all levels except for GEOS5, the two reanalyses, and the UCLA GCM. From Figs. 9c and d, we find that a large model bias exists in the mid- and high-latitudes of both hemispheres below 700 hPa, especially over the southern high latitudes, similar to that in IWC reported in Li et al. (2012). Apart from the two Inmcm4 models, which are biased low relative to the observed estimate throughout the troposphere, the models are biased high at all levels in the tropics. However, it is important to keep in mind that the observed estimate of CLWC excludes the contribution from mixed-phase clouds. The GFDL-CM3, which includes water mass from precipitating particles, does not have an observational TLWC counterpart, so no comparison can be made.
Finally, in order to determine whether there are systematic biases across the models in the vertical structure of their CLWC fields, we examine the spatial correlation and standard deviation at each level (900, 850, 800, 700, 600, 500, and 400 hPa) for all the models and the multi-model mean against the observed CLWC values (i.e., a Taylor diagram analysis). A Taylor diagram representing each pressure level of the annual mean CLWC for the CMIP5 multi-model mean (red), MERRA (blue), and ECMWF-Interim (green) is shown in Fig. 10. The CMIP5 multi-model mean shows poor correlations with observations at all levels considered, some of them zero or negative, while it shows reasonable standard deviation ratios at the 900, 850, and 800 hPa levels. The large ratios at the 700, 600, and 500 hPa levels are partly due to the exclusion of the mixed-phase cloud mass from the observational estimate of CLWC, since the observational partitioning of mixed-phase cloud into liquid and ice is not reliable.

Summary and Discussion
To assess the fidelity of GCMs in simulating cloud liquid water, the liquid water path (LWP) retrieved from a passive satellite sensor, MODIS, and the vertically resolved liquid water content (LWC) estimates from the satellite radar, CloudSat, are combined synergistically. We find that the patterns of CLWP bias in CMIP3 and CMIP5 are systematically similar, with most models being biased high (Fig. 6). When the MODIS filtered CLWP is used as the reference, there is overall a fairly wide disparity in the fidelity of CLWP representations in the CMIP5 models examined. Even for the annual mean maps considered, the differences between observed and modeled values easily reach factors of 2, and nearly up to 10, for most of the GCMs over a number of regions. There is only one model, CSIRO, among the CMIP5 ensemble examined here that performs rather well in regard to the Taylor diagram metrics (i.e., standard deviation ratio and pattern correlation) for CLWP. The next-best performers are MPI and MRI. The models that perform particularly poorly include MIROC, MIROC-CHEM, bcc, bcc-esm, INMCM4, and INMCM4-ESM, with the former four (latter two) being biased significantly high (low) in terms of overall CLWP magnitude. The remaining models exhibit intermediate performance.
As expected, the two reanalyses examined perform relatively well compared to the GCM group as a whole due to the incorporation of a wide array of constraining observations. This is still notable, though, since they do not assimilate cloud liquid observations and thus rely on (parameterized) model physics to represent this quantity. However, even with the assimilation of many other related quantities, neither MERRA's nor ECMWF-Interim's performance was within the uncertainty of the observations for the pattern correlation. The UCLA GCM is one of the best performing CGCMs along with the three identified above (i.e., CSIRO, MPI, and MRI).
Considering the large disparities between the observations and modeled values of CLWP, it is evident that while the models may be providing roughly the correct radiative energy budget, many are accomplishing it by means of unrealistic cloud liquid mass, which in turn likely indicates unrealistic cloud particle sizes and cloud cover (e.g., Norris and Weaver 2001; Zhang et al. 2003; Lin and Zhang 2004; Schmidt et al. 2006; Cole et al. 2011; Franklin et al. 2012; Kay et al. 2012). The CloudSat and MODIS combined CLWC (Fig. 3) is generated by complementing the CloudSat CLWC with a MODIS-derived CLWC based on the moist adiabatic assumption, using the MODIS CLWP and the CALIPSO-determined cloud top height. The CloudSat and MODIS combined CLWC is used as a reference for model evaluations. Examination of the vertical structure of CLWC in terms of global, zonal, and large-region averages (e.g., high, mid, and tropical latitudes) indicates similar findings in terms of overall performance across the models and reanalyses examined here. Most of the systematic errors in the global-mean vertical profile of CLWC occur below the mid-troposphere, where the models tend to significantly overestimate CLWC compared to the observed estimate. No model (except CSIRO in the tropics) generates reasonable spatial variability at all levels, in particular above 800 hPa.
Given that viable observed estimates of CLWP from MODIS, and of CLWP/CLWC from CloudSat for about 4-5 years, have been available, yet GCMs still exhibit such large biases, the large disparity between the models and observations indicates the challenges that all the modeling groups face in utilizing the observations. These challenges likely include the uncertainty of CloudSat LWC below 900 hPa and the failure of retrievals under precipitating/convective conditions, which limit the development and improvement of the model PBL cloud and convection parameterizations. The complexity of using the observational data in the present study is itself evidence of this difficulty.
Besides the conventional CGCMs, no counterpart TLWC/TLWP observations that include drizzle and rain are available for evaluating the performance of the GFDL-CM3. TLWC/TLWP observational data are lacking because it is difficult to quantify drizzle and rain from remote sensing measurements.
In this study, we mainly focused on CLWC/CLWP comparisons between models/analyses and satellite retrievals. It is beyond the scope of this study to probe the causes of the observation-to-observation and model-to-model differences and the model-to-observation biases. However, highlighting a few outstanding questions is instructive to help keep in mind the complexity associated with modeling atmospheric liquid water. These include representing/parameterizing PBL clouds, shallow and deep convection, evaporation processes, autoconversion, and cumulus detrainment, as well as the overall interplay between these different physical parameterizations (and the large-scale dynamics).
While the observations exhibit considerably better agreement in CLWP values than the CMIP5 models shown in the Taylor plot, there is a clear disagreement over the tropical western Pacific (i.e., warm pool), the ITCZ/SPCZ, and the mid-latitude storm tracks. Several factors could contribute to the disagreement between these observational estimates. Passive techniques are challenged by multi-level, mixed-phase, and thick clouds as well as by surfaces that are bright and/or have variable emissivity, and they are limited to estimates of CLWP with no or poor profiling capability. The main problem of the CloudSat CLWC/CLWP retrievals is that they fail in drizzle and rain conditions; the CLWC retrieval is therefore difficult and often fails below the PBL top and in convective cores under heavy rain. In addition, CloudSat CLWC/CLWP in PBL low clouds (normally lower than 900-1000 m) is often not retrieved because of the radar clutter issue near the surface. PBL stratocumulus clouds, deep convective clouds, and shallow cumulus clouds are critical for the model cloud representations of the regions off the coasts of California and Peru in CGCMs, but the CloudSat CLWC/CLWP is of limited use in evaluating the clouds in these regions.
In addition, at the time of this study we used existing flags for precipitation in the CloudSat 2C-PRECIP-COLUMN product to distinguish precipitating (including drizzle) from non-precipitating clouds; these flags are valid only over oceans. Recent research has more systematically addressed the identification (Haynes et al. 2009) and quantification (Lebsock and L'Ecuyer 2011) of precipitation from CloudSat data, leading to the development of experimental precipitation data products. As of this date, there remain inconsistencies between the various CloudSat products regarding the identification of precipitation, resulting in highly uncertain cloud water content retrievals in the presence of precipitation. An effort is under way to reconcile these inconsistencies for future product releases; however, determining quantitative information regarding cloud water content in the presence of precipitation will remain a challenge for some time to come.
Finally, the ability to explicitly represent the cloudy, convective core, and precipitating components of the liquid (and ice) mass has important physical implications apart from just providing additional observational constraints. While more work needs to be pursued in this area, there is a strong suggestion that GCMs should strive to explicitly represent a broader range of ice (Li et al. 2012) and liquid hydrometeors, namely the larger falling hydrometeors (rain, snow), and include their effects in the radiative heating calculations, which for the moment, although indirectly taken into account in many cases, are largely ignored from an explicit point of view. Moreover, along with the evaluation results of this study, this consideration of the radiative impacts provides additional evidence that the radiation balance in the CMIP class of GCMs, which matches observations, is still underconstrained and in many cases achieved in unrealistic ways. Taken together, these points indicate the need for (1) much improved retrieval algorithms that include observations from multiple platforms; (2) data assimilation of space-based observations of cloud properties (something which is currently remarkably absent); and (3) additional observational resources to adequately characterize and constrain cloud-precipitation-radiation interactions. This is likely to include multi-channel radar/lidar information to characterize the profile and spectrum of cloud and precipitation particle sizes as well as Doppler radar capability to provide information on cloud and precipitation dynamics.
er colleagues from climate modeling centers for providing model information. Thanks also to Gregory Huey with and Bin Lin/NASA LARC for reading and providing comments on the manuscript. This study was carried out on behalf of the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration.
The contribution of Hsi-Yen Ma to this work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344.

Appendix A
(1) The Uncertainties and Limitations of MODIS Cloud LWP
Platnick et al. (2003) summarize most (although not all) of the errors associated with the retrieval of cloud optical properties and thus, by inference, of the liquid water path derived in this way. Their analysis indicates that the characteristic optical depth error of low clouds is about 10% over most of the range of optical depths, with much larger errors occurring for smaller optical depths, especially for clouds over land, due to the growing influence of surface albedo uncertainties in these cases. The errors in effective radius for low clouds are similar, in the range of 10-20%, but are substantially larger for ice clouds. Platnick et al. (2003) omit contributions from model errors, which are discussed in detail in Stephens and Kummerow (2007); for certain retrieval problems [such as broken clouds with three-dimensional (3D) effects] these omitted errors dominate the error budget.
There are also other limitations of these data that are not considered in Platnick et al. (2003). The MODIS optical measurement loses sensitivity for large LWPs since the reflected sunlight saturates to almost constant values at large optical depths. The optical methods cannot retrieve meaningful water paths in mixed-phase clouds or in clouds masked by high-level ice clouds. Thus, optical methods cannot sample the full distribution of liquid water paths and are limited primarily to warm, single-layered cloud systems. The method is also subject to uncertainties introduced by undetected high clouds overlying low clouds. These events are not negligible and appear as problematic biases in current radiance-based climatologies (e.g., Haynes et al. 2011; Mace and Wrenn 2013). A further limitation of these optical measurements is that they tend to be weighted toward the upper part of the cloud and contain limited information about the liquid water path associated with precipitation.
These limitations are confirmed by a comparison study with in situ observations from VOCALS-REx (King et al. 2013). The MODIS-retrieved effective radius of the droplet size distribution overestimates the in situ measurements by 13% on average, with the largest overestimation coinciding with the presence of drizzle-sized droplets. The optical depth retrieved from MODIS also overestimates the in situ values. These two high biases lead to an overestimation of liquid water path by MODIS. Although these limitations have to be kept in mind in considering the results to follow, we conclude that under the most ideal conditions (no precipitation, no high cloud, fully overcast without any significant 3D effects) the MODIS LWP uncertainty is expected to be in the range of 10-30% (King et al. 2013).

(2) The Collocation of CloudSat-MODIS-AMSR-E
The MODIS-collocated-to-CloudSat and AMSR-E-collocated-to-CloudSat datasets are part of the CloudSat-centric Co-location Data Products for A-Train data and ECMWF analysis outputs, available at http://csyotc.cira.colostate.edu/index.php. The data products are generated for the Year of Tropical Convection (YOTC) study (http://www.ucar.edu/yotc). Specifically, the data products contain convection- and cloud-related A-Train satellite retrievals that are collocated to CloudSat footprints for the period May 2008 to April 2010. Products include quantities from CALIPSO, MLS, AIRS, AMSR-E, CERES, and MODIS, and a number of fields from the specialized YOTC analysis produced by ECMWF. The co-location data products are produced to facilitate the synergistic use of the multi-source datasets by providing them with common geo-location parameters and in a common data format.
The co-location process is done by finding the nearest-neighbor footprint in the source data (i.e., A-Train or ECMWF data) for a given footprint in the target data (i.e., CloudSat data). For A-Train co-location, if the distance between the nearest-neighbor footprint and the target footprint is smaller than 1.5 times the sum of the two footprint sizes, and the time difference between the two data is smaller than about 15 minutes (which is the largest time span of the A-Train instruments), we accept the nearest-neighbor footprint and co-locate it onto the target footprint. Neither spatial averaging nor interpolation is applied in the co-location process. The value from the accepted nearest-neighbor footprint is simply copied over to the target footprint. We take this simple nearest-neighbor copying approach because appropriate averaging and interpolation vary from instrument to instrument and from variable to variable and require detailed, individual analyses.
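The acceptance rule described above can be sketched with a brute-force nearest-neighbor search. This is an illustration only, not the production co-location code: the dictionary layout, the function names, the use of a haversine great-circle distance, times expressed in seconds, and the footprint sizes in the toy example are all assumptions.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; inputs in degrees (broadcastable)."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def collocate(tgt, src, tgt_size_km, src_size_km, max_dt_s=900.0):
    """Copy the nearest source value onto each target footprint.

    tgt: dict of arrays 'lat', 'lon', 'time' (seconds).
    src: dict of arrays 'lat', 'lon', 'time', 'value'.
    A match is accepted only if the distance is below 1.5 times the sum of
    the footprint sizes and the time difference is below max_dt_s;
    rejected targets are filled with NaN. No averaging or interpolation.
    """
    max_dist = 1.5 * (tgt_size_km + src_size_km)
    out = np.full(len(tgt["lat"]), np.nan)
    for i in range(len(out)):
        d = haversine_km(tgt["lat"][i], tgt["lon"][i], src["lat"], src["lon"])
        j = int(np.argmin(d))                     # nearest source footprint
        dt = abs(src["time"][j] - tgt["time"][i])
        if d[j] < max_dist and dt < max_dt_s:
            out[i] = src["value"][j]              # plain copy of the value
    return out

# Hypothetical toy footprints (sizes in km are illustrative only)
tgt = {"lat": np.array([0.0, 10.0]), "lon": np.array([0.0, 0.0]),
       "time": np.array([0.0, 0.0])}
src = {"lat": np.array([0.01, 50.0]), "lon": np.array([0.0, 0.0]),
       "time": np.array([60.0, 60.0]), "value": np.array([5.0, 9.0])}
matched = collocate(tgt, src, tgt_size_km=1.4, src_size_km=1.0)
```

In the toy case, the first target lies about 1 km from a source footprint and receives its value, while the second has no source footprint within the distance threshold and is left as NaN. For orbit-length data, a spatial index would be a natural replacement for the inner brute-force search.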
The collocation was performed in order to apply the CloudSat-based precipitation and cloud condition flags to the MODIS LWP data and thereby partition these data into suspended (cloud-only), convective, and precipitating portions, for reasons discussed below. This collocation process significantly reduces the temporal and spatial sampling of the MODIS data. The collocated and partitioned MODIS LWP retrievals are mapped onto a 1° × 1° grid and averaged for all-sky conditions. Note that the MODIS MYD06_L2 product includes only LWP values from cloudy conditions and treats LWP retrievals under clear-sky conditions as invalid. In order to compute the all-sky averaged LWP, we set the LWP retrieval under clear-sky conditions to zero and include it in the average. This process is effectively the same as averaging only the cloudy-condition LWP values and multiplying the result by the cloud fraction in each grid cell.
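The equivalence stated in the last sentence can be checked with a small numerical sketch. The sample values are hypothetical, and NaN is assumed here as the marker for invalid clear-sky retrievals within one grid cell.

```python
import numpy as np

def all_sky_mean(lwp, is_cloudy):
    """All-sky mean LWP for one grid cell: clear-sky samples count as zero."""
    filled = np.where(is_cloudy, lwp, 0.0)
    return filled.mean()

# Hypothetical samples in one grid cell (g m^-2); NaN marks clear sky
lwp = np.array([120.0, np.nan, 80.0, np.nan])
cloudy = np.isfinite(lwp)

zero_filled = all_sky_mean(lwp, cloudy)               # clear sky set to zero
in_cloud_times_cf = lwp[cloudy].mean() * cloudy.mean()  # in-cloud mean x cloud fraction
```

Both routes give 50 g m⁻² here: the in-cloud mean is 100 g m⁻² and the cloud fraction is 0.5.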

(3) Cloud Detection Capabilities of CloudSat and MODIS
Apart from the retrieval uncertainty, the cloud detection capabilities of CloudSat and MODIS are an important factor in assessing the observational uncertainty. When the CALIPSO lidar measurements are combined with the CloudSat radar measurements, they provide the most accurate cloud detection and height structure available among all satellite-based measurements today because of their complementary features. The radar can probe optically thick layers better than the lidar, while the lidar can sense optically thin layers and tenuous cloud tops better than the radar. The CloudSat 2B-Radar-Lidar GEOPROF product is a retrieval product that uses the synergy of the radar and lidar.
Although the combined radar and lidar detection is superior to other measurements, the radar-only measurement misses most low clouds, which is attributed to the radar surface clutter issue. This is consistent with previous observational studies such as Marchand et al. (2008) and Kubar et al. (2011). While MODIS misses some clouds compared to the combined lidar and radar detection, the MODIS detection capability is overall within 10% error. These detection deficiencies will unavoidably lead to systematic biases in the annual means of CLWC and CLWP retrieved from the CloudSat and MODIS measurements. Quantifying the systematic biases is difficult because there is little information about the cloud water mass (CLWC/CLWP) of the clouds that are missed by MODIS. However, the clouds missed by MODIS are expected to be mainly optically thin clouds and thus to have low liquid water mass. Therefore, the effect of missed clouds on the annual means of CLWC/CLWP should be much less than 10%, which is the frequency of clouds missed by MODIS as mentioned above.
The majority of the CMIP5 model outputs of cloud liquid water path (CLWP) are available for 1850-2005. However, CloudSat data are available only from June 2006. Due to the battery failure of the CloudSat cloud profiling radar, the data after 2011 are daytime only. In this study, we use data from January 2007 to December 2010 to include both daytime and nighttime. The MODIS LWP we use (May 2008 to April 2010) has more profiles collocated along the CloudSat track.
Since the data period is too short to compute a trend, we compare the annual mean LWP of MODIS data from January 2001 to December 2005 (Fig. A1a), which is closer to the CMIP5 data period, with the annual mean LWP from May 2008 to April 2010 (Fig. A1b) to examine their difference (Fig. A2c) and the change relative to the 2001-2005 annual mean (Fig. A1d). The change in LWP is within 10% (Fig. A2d), and the difference (Fig. A2c) is about 4-6 g m⁻², which is well below the biases in CMIP3/CMIP5 (values of 20 up to 100 g m⁻²) shown in Fig. 6. In addition, we compare the annual mean LWP of CMIP5 data from January 1970 to December 2005 with the annual mean LWP from January 2001 to December 2005; the change relative to the 1970-2005 annual mean (figure not shown) is within 2-4%, which is also well below the biases in CMIP3/CMIP5.

(4) Determination of Lifting Condensation Level (LCL)
We define the cloud base height as the lifting condensation level (LCL). Following Georgakakos and Bras (1984), the LCL pressure is estimated from the surface pressure P_SFC, the dewpoint temperature T_D (in Kelvin), the surface relative humidity RH_SFC, and the gas constant for water vapor R_V (= 461 J K⁻¹ kg⁻¹). The LCL pressure is converted to height using the CloudSat ECMWF-AUX data.