Evaluation of the SeaWiFS and MODIS Chlorophyll a Algorithms Used for the Northern South China Sea during the Summer Season

This study evaluates SeaWiFS-derived and MODIS-derived Chlorophyll a (Chl a) concentrations in the Northern South China Sea (NSCS), using in situ data collected during two research cruises conducted in the summers of 2004 (September 18 to October 8) and 2007 (August 10 to 29). Matchups within ±48 h and 3 × 3 pixels were used to compare satellite and in situ Chl a data, and the results reveal a systematic overestimation of Chl a concentration by the National Aeronautics and Space Administration (NASA) global algorithms (OC2v4, OC4v4, and OC3M). The RMSEs of the selected algorithms are larger than 0.35, except for OC2_D'Ortenzio (a regional algorithm for the Mediterranean Sea). The overestimation appears to be related to the large proportion (≈77%) of low Chl a concentrations (< 0.1 mg m⁻³), a consequence of the oligotrophic character of the South China Sea (SCS) in summer, and to errors in atmospheric correction introduced by aerosols. Therefore, the OC2 and OC4 algorithms for SeaWiFS and the OC3M algorithm for MODIS are adapted to the NSCS by fitting the satellite data set to in situ Chl a data from the NSCS. With new coefficients based on our field data, the regional versions of the three algorithms (TP series) performed well, with RMSE values of 0.245, 0.245, and 0.288 respectively, slightly higher than the algorithm "noise" (0.222 in RMSE). These TP series algorithms should be considered preliminary because of the relatively small number of available in situ data, and they are suitable for the summer season in the NSCS.


INTRODUCTION
Chlorophyll a (Chl a) concentration, a proxy for phytoplankton abundance, is a valuable indicator of the marine ecosystem, and satellite remote sensing is at present the only way to take frequent measurements of Chl a at regional and ocean-basin scales (Richardson et al. 2004). Studies of Chl a concentrations in the South China Sea (SCS) have been carried out using satellite sensors including the Coastal Zone Color Scanner (CZCS) (Tang et al. 1998), the Ocean Color and Temperature Scanner (OCTS) (Tang et al. 2002, 2003), and the Sea-viewing Wide Field-of-view Sensor (SeaWiFS) (Tang et al. 2004a, b, 2005; Zhao and Tang 2007; Zheng and Tang 2007).
To date, the most typical optical sensors for Chl a surveys are the SeaWiFS sensor and the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor. At present the National Aeronautics and Space Administration (NASA) uses the OC4v4 algorithm for global SeaWiFS processing and OC3M for global MODIS processing (Esaias et al. 1998; McClain et al. 1998; O'Reilly et al. 2000). Global algorithms for satellite remote sensing do not always provide reasonable retrievals in all areas of the ocean, because an empirical algorithm is only as good as the data it is based on, and depends on how representative those data are of the environment or bio-optical provinces where the algorithm is to be applied (IOCCG 2006). Evaluation and validation of algorithms in regional seas have repeatedly shown that revised or new regional algorithms are necessary (D'Ortenzio et al. 2002; Iluz et al. 2003; Darecki and Stramski 2004). Previous work showed that both SeaWiFS and MODIS Chl a data agreed with in situ measurements in most areas of the SCS, although it should be noted that in situ Chl a values were relatively high (> 0.1 mg m⁻³) in those study areas, i.e., along the coast and near upwelling areas (Tang et al. 2003; Zhang et al. 2006). There are few comparisons between SeaWiFS or MODIS products and in situ data in the oligotrophic areas of the NSCS (Chl a < 0.1 mg m⁻³).
Due to the oligotrophic characteristics of the SCS, especially during the summer season (Chen et al. 2004; Chen et al. 2006), the available satellite algorithms may have their limits (Hooker and McClain 2000). Consequently, amendments to the global empirical satellite algorithms were made and new regional ocean color algorithms were proposed for the NSCS (Wu et al. 2004; Xu et al. 2007). In this work, the performance of three global empirical algorithms (two for SeaWiFS, one for MODIS) and one regional algorithm (OC2_D'Ortenzio for the Mediterranean Sea) in the NSCS is evaluated. Our in situ data set is used to generate regional algorithms, whose performance is compared with that of NASA's operational algorithms and the regional algorithm for the Mediterranean Sea. Our study thus presents an independent analysis of SeaWiFS and MODIS Chl a data in the NSCS.

Study Area
The SCS, located along the tropical-subtropical rim of the western North Pacific Ocean and connecting the Pacific Ocean and the Indian Ocean, is one of the largest marginal seas in the world. It connects to the western Philippine Sea through the Luzon Strait (LS) and to the East China Sea through the Taiwan Strait (TWS), and covers a total area of about 3.5 million km² from the equator to 23°N and from 99 to 121°E, with an average depth of 2000 m (Fig. 1) (Su 2004).
The SCS is a predominantly oligotrophic and ultra-oligotrophic basin (Chen et al. 2004; Chen et al. 2006). However, higher biomass may occur seasonally and locally in regions affected by sea surface temperature, monsoon, river runoff or upwelling (Tang et al. 2002, 2004, 2006). The SCS is dominated by a strong northeasterly monsoon in winter and a southwesterly monsoon in summer (Liu and Xie 1999), and the monsoons play an important role in the dynamics of the upper circulation of the SCS throughout the year (Wyrtki 1961). The SCS is strongly affected by industrial emissions from its northern border and exhibits high Chl a concentrations along the coast (Zhao et al. 2005), and the aerosol optical thickness of the NSCS exhibits obvious and extreme diurnal change (Liu et al. 2008). Furthermore, the region is frequently subject to typhoons, which means that atmospheric conditions are complicated (Elsner and Liu 2003; Wu et al. 2006; Zhao et al. 2007; Zheng and Tang 2007).

In Situ Data Observation
In situ data were collected during two research cruises in the NSCS conducted in the summer of 2004 (from September 18 to October 8) and 2007 (from August 10 to 29). Surface water samples of 1000 ml were collected at each station and filtered through a 200 μm mesh to remove large abiotic particles and zooplankton (Zhou et al. 2004). The samples were filtered again using 0.45 μm cellulose filter papers for the extraction of plant pigments. The filter papers were then stored in 90% acetone for 24 hours in the dark at 4°C. The spectral absorption of Chl a was measured following the fluorometric method using the Turner-Design 10 Fluorometer (Parsons et al. 1984). The Chl a values were then calculated from the spectral information.

Temporal and Spatial Considerations
A rigorous comparison requires that in situ data be collected within ±2-3 h of the satellite overpass (Bailey et al. 2000). However, due to frequent cloud cover and rainfall in the NSCS in summer, such matching data pairs are limited. Therefore we also test time differences of ±24 and ±48 h to find a usable time difference for the matching pairs. Satellite navigation may not be accurate to a pixel because of noise (Patt 2002); therefore, a box of some number of pixels is defined, centered on the location of the in situ measurement. Bailey and Werdell (2006) suggest a kernel of 5 × 5 (25 pixels). In this work, we compare boxes of 1 × 1, 3 × 3, and 5 × 5 pixels to find a suitable box size.
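The matchup criteria described above can be illustrated with a minimal Python sketch. It is not the authors' processing code: the function names `box_mean` and `within_window` are hypothetical, masked or cloudy pixels are assumed to be stored as `None`, and the arithmetic mean of the valid pixels is an assumption, since the text does not state how the pixels in a box are combined.

```python
from datetime import datetime, timedelta
from statistics import mean


def box_mean(grid, row, col, size=3):
    """Mean of the valid (non-None) pixels in a size x size box centred on (row, col).

    Returns None when no valid pixel falls inside the box.
    """
    half = size // 2
    values = []
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            # Skip pixels outside the scene and pixels masked as None.
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] is not None:
                values.append(grid[r][c])
    return mean(values) if values else None


def within_window(t_in_situ, t_overpass, hours=48):
    """True if the in situ sample was taken within +/- hours of the overpass."""
    return abs(t_in_situ - t_overpass) <= timedelta(hours=hours)
```

For example, `within_window(datetime(2004, 9, 20, 6), datetime(2004, 9, 18, 12))` is true for the ±48 h window (the gap is 42 h) but false when called with `hours=24`.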

Satellite Data Processing
SeaWiFS daily Level 1A (L1A) data were downloaded from the NASA OceanColor Home Page (http://oceancolor.gsfc.nasa.gov/). They were processed up to Level 2 (L2) Chl a data to obtain remote sensing reflectance (R_rs) maps for the four available visible bands (443, 490, 510, and 555 nm) using the SeaWiFS Data Analysis System (SeaDAS 5.4) software, which implements a modified atmospheric correction method (Gordon and Wang 1994), and then mapped to a cylindrical equidistant projection at ~1 km pixel⁻¹ resolution.
Daily MODIS/Aqua L1A data were obtained in the same way as SeaWiFS. They were first processed to the corresponding Level 1B (L1B) data, and then to L2 products to obtain the three available visible bands (443, 488, and 551 nm) using SeaDAS 5.4. These data were mapped in a manner similar to that used for SeaWiFS.

Evaluation Analysis
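General comparison methods used in the validation analysis for SeaWiFS from NASA (http://seabass.gsfc.nasa.gov/seabasscgi/validation.cgi) and in recent literature (O'Reilly et al. 2000; Gregg and Casey 2004; Zhang et al. 2006) are employed for the evaluation in this study. Parameters extracted from the analyses include the median ratio, median difference, root mean square log error (RMSE) and average difference (bias), which describe the fidelity of the satellite data. Median ratio and median difference are expressed as:

\[ \text{Median ratio} = \operatorname{median}\left(\frac{S_i}{I_i}\right), \qquad \text{Median difference (\%)} = \operatorname{median}\left(\frac{|S_i - I_i|}{I_i}\right) \times 100 \]

and the root mean square log error (RMSE) and average difference are defined as:

\[ \text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left[\log_{10}(S_i) - \log_{10}(I_i)\right]^{2}}, \qquad \text{bias} = \frac{1}{n}\sum_{i=1}^{n}\left[\log_{10}(S_i) - \log_{10}(I_i)\right] \]

where S indicates satellite data, I indicates in situ data, and n is the number of samples. The RMSE is an estimate of the error of the satellite data set, the average difference is an estimate of the bias, and the coefficient of determination (r²) from the correlation analysis indicates the covariance between the satellite data set and the in situ measurements.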
Because the natural distribution of Chl a is lognormal (Campbell 1995), both in situ and satellite data should be logarithmically transformed (base 10) before comparison.
The performance of the algorithms in the NSCS can then be evaluated with these statistics.
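As an illustration of these statistics, the following Python sketch computes them for a list of matchup pairs. It assumes the usual definitions: the median of S/I, the median of |S − I|/I expressed in percent, and the root mean square and mean of the base-10 log differences. The function name `matchup_statistics` is hypothetical and the code is not taken from the original analysis.

```python
import math
from statistics import median


def matchup_statistics(satellite, in_situ):
    """Summary statistics for paired satellite (S) and in situ (I) Chl a values."""
    pairs = list(zip(satellite, in_situ))
    n = len(pairs)
    ratios = [s / i for s, i in pairs]
    pct_diffs = [abs(s - i) / i * 100.0 for s, i in pairs]
    # Differences are taken after a base-10 log transform (lognormal Chl a).
    log_diffs = [math.log10(s) - math.log10(i) for s, i in pairs]
    return {
        "median_ratio": median(ratios),
        "median_difference_pct": median(pct_diffs),
        "rmse": math.sqrt(sum(d * d for d in log_diffs) / n),
        "bias": sum(log_diffs) / n,
    }
```

Under these assumed definitions, a satellite product that is uniformly twice the in situ value gives a median ratio of 2, a median difference of 100%, and an RMSE and bias both equal to log10(2) ≈ 0.301.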

Temporal and Spatial Considerations
There were 40 stations monitored in the 2004 cruise and 56 stations in the 2007 cruise (Figs. 2a, b). Fig. 2 shows that low Chl a concentrations (≤ 0.1 mg m⁻³) are present in most areas of the NSCS, especially northwest of Luzon, while high Chl a concentrations (> 0.5 mg m⁻³) are observed in coastal waters and upwelling areas. Two in situ stations located northwest of Luzon show the lowest Chl a values, 0.003 and 0.005 mg m⁻³ (Figs. 2c, d), which are lower than 0.008 mg m⁻³ (O'Reilly et al. 2000), and two in situ stations located in the coastal and upwelling area show higher Chl a values of 0.601 and 0.507 mg m⁻³ (Figs. 2c, d).
Considering that such Chl a data are unrepresentative and may induce large errors in our comparison (IOCCG 2000), they are removed in advance.
There are only 4 matching pairs for SeaWiFS/in situ and 7 matching pairs for MODIS/in situ when a time difference of ±3 h is required (Figs. 2c, d). These ±3 h matching pairs are clearly too few for meaningful statistics. There are 17 matching pairs within ±24 h for both SeaWiFS/in situ and MODIS/in situ, and 36 pairs for SeaWiFS/in situ and 35 pairs for MODIS/in situ within ±48 h (Figs. 2c, d). The statistical results for the SeaWiFS/in situ comparison are almost the same for ±24 and ±48 h, and differ only slightly for the MODIS/in situ comparison (Table 1). Because the difference between ±24 and ±48 h is small and a larger number of matching pairs benefits the algorithm evaluation, we selected ±48 h as the temporal criterion.
The statistical results for the 1 × 1 pixel box are worse than those for the 3 × 3 and 5 × 5 pixel boxes. The r² of the 1 × 1 pixel box is the lowest for both SeaWiFS and MODIS (Table 2), which may be caused mainly by fluctuations due to noise. The results are almost the same for 3 × 3 and 5 × 5 pixels. Following the principle of less inaccuracy, a box of 3 × 3 pixels is the suitable choice.

Algorithm Presentation
Four algorithms are selected for evaluation, including three empirical algorithms (OC2v4 and OC4v4 for SeaWiFS, OC3M for MODIS) and one regional algorithm proposed by D'Ortenzio for the Mediterranean Sea (Esaias et al. 1998; O'Reilly et al. 1998, 2000; D'Ortenzio et al. 2002). The formulas of these algorithms and the numerical values of their coefficients are shown in Table 3.
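For readers without Table 3 at hand, the general shape of the maximum-band-ratio algorithms (OC4 and OC3M type) can be sketched in Python as a polynomial in the log10 of a blue-to-green reflectance ratio. This is a hedged illustration only: the function name `band_ratio_chl` is hypothetical, and the coefficient tuples are the values commonly quoted for the NASA global versions, which should be checked against Table 3 before any use. The OC2 form, as commonly written, additionally carries a constant offset outside the exponent and is not covered by this sketch.

```python
import math

# Coefficients as commonly quoted for the NASA global algorithms.
# Assumption: verify against Table 3 of the paper before relying on them.
OC4V4 = (0.366, -3.067, 1.930, 0.649, -1.532)   # bands 443, 490, 510 over 555 nm
OC3M = (0.2830, -2.753, 1.457, 0.659, -1.403)   # bands 443, 488 over 551 nm


def band_ratio_chl(rrs_blue, rrs_green, coeffs):
    """Chl a (mg m^-3) from a maximum-band-ratio polynomial in log10 space.

    rrs_blue  : remote sensing reflectances of the candidate blue bands
    rrs_green : remote sensing reflectance of the green reference band
    coeffs    : polynomial coefficients a0, a1, ... in increasing order
    """
    r = math.log10(max(rrs_blue) / rrs_green)
    log_chl = sum(a * r ** k for k, a in enumerate(coeffs))
    return 10.0 ** log_chl
```

With these forms, a larger blue-to-green ratio (clearer, bluer water) maps to a lower Chl a estimate, which is why small reflectance errors in oligotrophic waters can translate into noticeable relative errors in retrieved Chl a.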

Algorithm Evaluation and Adaptation
Scatter plots of satellite versus in situ Chl a values for each selected algorithm (OC4v4, OC2v4, OC2_D'Ortenzio, and OC3M) are shown in Fig. 3; the median ratio, median difference (%), slope, intercept, r², RMSE and bias are listed in Table 4. The results show that OC2_D'Ortenzio has the highest fidelity, with RMSE = 0.289, median difference = 82.655, and slope = 1.403, while the RMSE and median difference values of the other three algorithms are all above 0.35 and 100, respectively. The statistical parameters clearly show that all algorithms except OC2_D'Ortenzio overestimate Chl a concentration. OC2v4 has the largest overestimation, with a median difference of 217.909 and a median ratio of 3.179. The r² is almost identical for OC2v4 (0.734) and OC3M (0.733), and the OC4v4 algorithm shows the highest r² (0.791). OC4v4 performs better than OC2v4, as is well known. OC2_D'Ortenzio also performs better than OC2v4. The significant bias (> 0.1) of the retrieved data suggests, however, that it may be possible to improve the performance of these algorithms by replacing the standard parameter values (the regression coefficients) with new values determined from our field measurements in the NSCS. Three new sets of coefficients are calculated by nonlinear least squares using a trust-region algorithm, yielding the OC4_TP, OC2_TP, and OC3M_TP algorithms (Table 3). Comparisons between the three new models and the in situ Chl a data set show that the scatter plots are now distributed around the line of best agreement (Fig. 3). A good relationship can be seen between in situ and algorithm-derived Chl a concentrations (Table 4): the slopes and r² are generally improved, and the RMSE and bias are lower. The RMSEs of the three new algorithms are all below 0.35, and their bias is < 0.1, which is much better than the four evaluated algorithms.
[Table 3 fragment recovered from the text: coefficients [0.3439, -6.564, 14, -15.61, 6.255], source "Present paper".]
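The idea of refitting regional coefficients can be sketched as follows. Note the swapped-in technique: the authors report a nonlinear least squares fit with a trust-region algorithm, whereas this sketch uses ordinary linear least squares, which is sufficient for an OC4- or OC3M-type model because log10(Chl a) is linear in the polynomial coefficients. An OC2-type form with an additive offset would need an iterative solver instead. The function name `fit_log_polynomial` is hypothetical, and the code is a pure-Python illustration rather than the procedure used in the paper.

```python
import math


def fit_log_polynomial(ratios, chl, degree=4):
    """Least-squares fit of log10(chl) = sum_k a_k * R**k, with R = log10(ratio).

    ratios : blue-to-green reflectance ratios at the matchup stations
    chl    : in situ Chl a concentrations (mg m^-3)
    Returns the coefficients a_0 ... a_degree.
    """
    x = [math.log10(r) for r in ratios]
    y = [math.log10(c) for c in chl]
    m = degree + 1
    # Normal equations (A^T A) a = A^T y for the Vandermonde design matrix A.
    ata = [[sum(xi ** (j + k) for xi in x) for k in range(m)] for j in range(m)]
    aty = [sum(yi * xi ** j for xi, yi in zip(x, y)) for j in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, m):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, m):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]
    # Back substitution on the upper-triangular system.
    coeffs = [0.0] * m
    for row in range(m - 1, -1, -1):
        tail = sum(ata[row][k] * coeffs[k] for k in range(row + 1, m))
        coeffs[row] = (aty[row] - tail) / ata[row][row]
    return coeffs
```

With only a few dozen matchups, a fourth-order fit of this kind is easily dominated by a handful of stations, which is consistent with the authors' caution that the regional coefficients are preliminary.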

Temporal and Spatial Selection
Cloud cover is frequent in the NSCS, especially during the summer season, because typhoons frequently occur with heavy cloud coverage (Wu et al. 2006). With a time difference of ±48 h, the matching pairs in the 2004 cruise number 32 for SeaWiFS/in situ and 27 for MODIS/in situ. The number of matching pairs in the 2007 cruise is 8 for SeaWiFS/in situ and 11 for MODIS/in situ. The difference is probably due to fewer typhoons during the 2004 cruise than during the 2007 cruise (http://gis.typhoon.gov.cn/typhoonweb/). A time difference of ±48 h might therefore be a reasonable choice in summer in the NSCS (Zhang et al. 2006, 2007).
The spatial coverage includes moderately eutrophic coastal waters along Guangdong Province and Hainan Island, the upwelling area in the Taiwan Strait, and oligotrophic waters in the open area of the NSCS (Fig. 2). To reduce the impact of geophysical variability, a small box of 3 × 3 pixels is reasonable (Bailey and Werdell 2006).

Comparison between Selected Algorithms and New Algorithms
The results presented in the previous section raise the question of why the global empirical algorithms overestimate Chl a concentration while the regional algorithm of D'Ortenzio shows relatively better performance in the NSCS.
One possible cause is our particular in situ data set, which covers a range from very oligotrophic (the area west of the Luzon Strait) to eutrophic regimes (coastal and upwelling areas). Field Chl a values vary between 0.013 and 0.426 mg m⁻³, though low values (< 0.1 mg m⁻³) are definitely more numerous (≈77%). In summer, Chl a concentration is lower than in other seasons because of the relatively high sea surface temperature, but strong offshore currents are often induced by the southwesterly monsoon, and areas of stronger wind stress usually coincide with areas of higher Chl a concentration (Zhao et al. 2005). Such a Chl a distribution may yield an in situ data set that does not match the data set of the SeaWiFS Bio-optical Archive and Storage System (SeaBASS), in which oligotrophic waters tend to be underrepresented and mesotrophic and eutrophic regimes overrepresented (O'Reilly et al. 1998). The other possible reason is error in the atmospheric correction caused by aerosols (McClain et al. 2006). The aerosol optical thickness is the primary parameter in the atmospheric correction algorithm. However, the aerosol optical thickness of the NSCS in summer has an obvious diurnal change, and wind blowing from land induced by typhoons has an obvious influence on it (Liu et al. 2008), which might introduce further errors in the atmospheric correction.
The results show that OC2_D'Ortenzio performs well in the NSCS. The first reason appears to be that low Chl a values (< 0.1 mg m⁻³) are also numerous (≈70%) in the data set of D'Ortenzio et al. (2002); in this study the proportion of low Chl a concentrations (< 0.1 mg m⁻³) is about 77%. Secondly, in both regions the aerosols come from land (the Chinese mainland at the northern border of the NSCS, Europe at the northern border of the Mediterranean Sea) (D'Ortenzio et al. 2002), which can induce similar atmospheric correction errors.

SUMMARY
The major aim of this paper is the evaluation of four algorithms (OC2v4, OC4v4 and OC3M, NASA's operational algorithms, and OC2_D'Ortenzio as an example of a regional algorithm) in the Northern South China Sea during the summer season. Temporal and spatial criteria of ±48 h and 3 × 3 pixels are determined, and the evaluation results show a systematic overestimation of Chlorophyll a (Chl a) concentration by the NASA global algorithms. The systematic misfit appears to be related to the large proportion (≈77%) of low Chl a concentrations (< 0.1 mg m⁻³) in our in situ data set and to imperfect atmospheric correction arising from aerosol optical thickness. Thus, based on our field data, we are not able to determine which known algorithm should be preferred for the NSCS. For this reason we generated three amended algorithms, retrieved by fitting our NSCS in situ data set with OC2-like, OC4-like, and OC3M-like formulas. The new TP series algorithms perform well, with higher accuracy (bias < 0.1), when applied to the in situ measurements.
Due to the relatively small number of available in situ data and the fact that our in situ data set only represents oligotrophic ocean conditions in the NSCS in summer, the generated algorithms have to be considered very preliminary, and further research into the reasons for the misfit of the global algorithms is still needed. A larger data set of bio-optical in situ measurements is necessary to produce a finely tuned algorithm for a region like the SCS.

Table 1. Comparison between satellite Chl a derived from the new algorithm and in situ Chl a according to the time difference of ±24 and ±48 h.

Table 2. Comparison between satellite Chl a derived from the new algorithm and in situ Chl a according to the image pixel box of 1 × 1, 3 × 3, and 5 × 5.

Table 3. Formulations of the empirical algorithms, published localized algorithms and new algorithms for the oligotrophic Northern South China Sea.

Table 4. Summary of the error analysis for the algorithms presented in Table 3.