Peak ground acceleration estimation using P-wave parameters and horizontal-to-vertical spectral ratios

Peak ground acceleration (PGA) can be used to estimate the seismic intensity. However, using P-wave features to estimate PGA is a challenging task. One of the reasons for that is that a seismic wave commonly undergoes modification due to various site effects, consequently leading to uncertainty in the predicted PGA. In order to accommodate site effects using site parameters together with P-wave parameters, this paper takes advantage of machine learning to consider multiple parameters simultaneously. Several artificial neural network (ANN) models considering different site effect parameters are constructed. The performances of these ANN models were investigated and compared. In total, 53531 ground motion data obtained from the Taiwan Strong Motion Instrumentation Program were utilized to develop the proposed approach. It was found that the proposed ANN model with horizontal-to-vertical spectral ratio parameters effectively reduces the error of the estimated PGA when compared with either the ANN model without site parameters or the ANN model with other site parameters. Article history: Received 22 March 2019


INTRODUCTION
Earthquake early warning (EEW) techniques have been widely studied during the past two decades. Generally, EEW systems can be divided into two categories: regional warning techniques and on-site warning techniques. Basically, regional EEW systems require to collect P-wave data of several stations of a seismic network next to the epicentral area. They often use the P-wave arrivals and waveform amplitudes of several stations to predict magnitude and epicenter distance of the earthquake, then the ground motion intensity can be estimated with the ground motion prediction equations (e.g., Allen and Kanamori 2003;Zollo et al. 2006). However, Taiwan frequently suffers from destructive seismic hazards due to inland earthquakes. For regions that are close to the epicenter, where the seismic intensity is usually much higher than that of regions outside, the regional warning time before a destructive wave arrives can be zero. On the other hand, because the on-site EEW systems only require the seismic information of the target site, a longer warning time can be obtained in regions that are close to the epicentre (Satriano et al. 2011). Many on-site approaches have been developed, e.g., Nakamura (1988), Odaka et al. (2003), Wu and Kanamori (2005), Böse et al. (2012), Hsu et al. (2013Hsu et al. ( , 2016Hsu et al. ( , 2018. More insight discussion of the difference between regional and on-site EEW can be found in Satriano et al. (2011).
Conventionally, an EEW technique is produced by using empirical regression models established based on a simple linear relationship between the logarithm of only one or two kinds of P-wave parameters, e.g., predominant period and the peak displacement, and the measured earthquake information (Nakamura 1988;Kanamori 2005;Odaka et al. 2005;Yamamoto et al. 2008). However, complex relationship between output target and input parameters may exist which makes improving accuracy of EEW a challenging task. Fortunately, machine learning is a promising field to solve many challenging problems. An artificial neural network (ANN) is a nonlinear statistical data modeling tool that Terr. Atmos. Ocean. Sci., Vol. 31, No. 1, 1-8, February 2020 models the complex relationship between inputs and outputs. In this study, an ANN algorithm is employed to predict the PGA of ground motion records based on P-wave parameters.
Another factor that greatly influences the performance of an EEW system is the site effects. In engineering applications, the average shear wave velocity of the upper 30 m of sediments (Vs30) is a simplified parameter, which has been widely used to classify sites. The National Earthquake Hazard Reduction Program (NEHRP) recommended the use of Vs30 as a significant indicator for classifying sites in building codes (BSSC 2001). Böse et al. (2012) achieved a great reduction of the uncertainty in estimating the peak ground velocity (PGV) by taking Vs30 into account during the analysis. Furthermore, Nakamura (1989) proposed the single station horizontal-to-vertical spectral ratio (HVSR) method using microtremors to evaluate dynamic characteristics of surface layers. The authors would like to further consider these site effect parameters when establishing an ANN regression model to estimate the PGA.
In this study, P-wave parameters calculated from the first 3 s interval of a P-wave signal and the physical quantities related to site effects were implemented in the establishment of ANN regression models to estimate the PGA of ground motion records. The PGA is selected as the target because the official earthquake intensity determined by the Central Weather Bureau (CWB) in Taiwan is calculated based on the PGA. The performances of ANN models with and without these site effect parameters were compared and discussed. Hsu et al. (2013) and Huang et al. (2014) considered six P-wave parameters of the first few seconds of vertical direction measurement, including the peak absolute value of acceleration, peak absolute value of velocity (Pv), peak absolute value of displacement, effective predominant period, integral of absolute acceleration (IAA), and integral of the squared velocity, to estimate the PGA using different algorithms. Both studies showed that the estimated PGA using only Pv and IAA achieves an approximate prediction error compared with the accuracy obtained by using six Pwave parameters, although the accuracy of the estimated PGA using all six P-wave parameters is the highest. When training the artificial neural network afterward, too many input parameters could require a lot of computational effort, therefore we tried to reduce the number of P-wave input parameters. As a result, we employed only these two P-wave parameters in our analysis. The two P-wave parameters were extracted from the first 3 s interval of a P-wave signal [t p = 3 (s)]. The equation for IAA is given as

P-wave Parameters
where ( ) u t p denotes the vertical component of the acceleration time history of ground motion after the arrival of the P-wave. In addition, all recorded acceleration signals were integrated once to obtain the velocity signals. A secondorder 0.075 Hz high-pass Butterworth filter was applied to remove the low-frequency drift after integration. A shortterm average/long-term average (STA/LTA) algorithm was applied to automatically determine the arrival time of the P-wave. The LTA value was set to 0.03 Gal based on the average noise at the stations. The STA value was calculated by averaging absolute acceleration values of 80 data points; it was triggered if the STA value successively exceeded two times of the LTA value.

Site Characteristic Parameters
Four parameters were considered to address the site characteristic information in this study. Firstly, Vs30, the average shear wave velocity within a depth of 0 and 30 m at each particular station, was determined by the National Center for Research on Earthquake Engineering (NCREE) and the CWB of Taiwan (Kuo et al. 2011(Kuo et al. , 2012(Kuo et al. , 2015. All stations with available Vs30 information were classified into five classes based on the criteria of the NEHRP (BSSC 2001). Secondly, the HVSR was used to assess the characteristics at a particular site. The HVSR was calculated based on the frequency spectrum of the horizontal vibration signal divided by the frequency spectrum of the vertical vibration signal. The Fourier spectrum was calculated using the fast Fourier transform (FFT) of the whole acceleration time history of 8192 points of the Hamming window with a sampling rate of 200 Hz. The Fourier spectra were smoothed every three points. Note that the HVSR values in this study were obtained by using real earthquake events. Because our purpose is to estimate the earthquake intensity of an upcoming earthquake, it is better to use earthquake data rather than ambient noise. Thirdly, the peak frequency (i.e., the frequency value corresponding to the maximum HVSR values) was also considered in this paper. Finally, the NEHRP site class was also employed as site characteristics when establishing the regression model for PGA prediction.

Artificial Neural Network
In this study, we used an ANN to perform a supervised machine learning process to estimate the possible maximum earthquake intensity. The entire ground motion database was divided into two subsets, namely, the training and validation subsets. Because the ground motion database is in the order of seismic events and the data with large PGA is very limited, odd numbers of ground motion data were used for training, while the remaining even ground motion data were used for validation. By doing so, the training data and validation data will contain almost every seismic event and the number of data with large PGA is nearly equal for the training data and validation data. During the process, we applied a four-layer forward fully-connected neural network (Fig. 1) to establish the regression model between concerned parameters and the PGA. The network consisted of one input layer, two hidden layers with ten neurons in each hidden layer, and one output layer. The activation function of the hidden and output layers was hyperbolic tangent and linear, respectively (Gurney 1997). Although different activation functions may achieve similar performance, we simply choose the hyperbolic tangent as the activation function in the inner layer because some literature has already employed it for constructing regression models and acceptable results were obtained (Masri et al. 1996(Masri et al. , 2000. Each type of input parameter is required to be normalized to a domain of [0,1] to adequately train the regression model. Each neuron in the hidden and output layers was connected with a neuron in the previous layer. The values of neurons in each layer were passed to the next layer through the combination of weighting and a bias term. This combination was further fed into an activation function. The general formula of each neuron can be expressed as: where z j was the output of the j th neuron; x i was the output of the i th neuron in the previous layer; w ij represents the weight connecting the i th neuron in the previous layer to the j th neuron in the current layer; b j is a the bias of the j th neuron; N j is the number of neurons in the previous layer; ( ) x y is the activation function.
The network was trained by minimizing the cost function, which is the mean-square error of the logarithm difference between the predicted PGA and target measured PGA: where y j and y j U was the output and target of the j th dataset of the ANN model, respectively; N is the number of total dataset. The cost functions of all regression models were identical. The network was trained using the Levenberg-Marquardt backpropagation algorithm, which is a combination of gradient descent algorithm and Newton's method. The training of the ANN models were terminated to prevent undesired overfitting if the accuracy of the training subset increases but the accuracy of the validation subset remains constant or decreases. We trained 1000 identical models but with different randomly initialized weights and biases of the neurons. Then the comparatively best performing network was selected among the 1000 trained models to increase the chance to avoid local minimums.

Ground Motion Database
To establish a representative ANN regression model for PGA prediction and investigate the influence of the site characteristic parameters, a large number of ground motion records with reliable site information is necessary. The ground motion data employed in this study were obtained from the TSMIP (see the Supplementary data section). The TSMIP network is maintained by the CWB in Taiwan to collect high-quality instrumental recordings of strong ground motions caused by earthquakes around Taiwan. About 700 free-field stations have been installed and are presently operating throughout Taiwan. The TSMIP station signals are digitized with 16-bit resolution or higher. Most accelerographs have a dynamic range of ±2 g. The EGDT (see the Supplementary data section) was built by the NCREE and the CWB in Taiwan and includes logging Fig. 1. Structure of the artificial neural network predicting the PGA. data from more than 400 stations and estimates of the corresponding Vs30 quantities (Kuo et al. 2011(Kuo et al. , 2012(Kuo et al. , 2015. Approximately 15 years of TSMIP data (29 July 1992 to 31 December 2006) recorded by the CWB were utilized, including 54 earthquake events with M L ≥ 6.0 and the 1999 Chi-Chi earthquake with M L = 7.3. Not only local but also distant events were included. Because the P-wave parameters extracted from the first 3 s interval after the trigger were required, the data with a total length < 3 s were excluded from this study. Moreover, stations without reliable site information (Vs30) and high quality acceleration data were not taken into consideration. Thus, a total of 53531 TSMIP data corresponding to 386 stations were available. Table 1 summarizes the site information of the ground motion database used in this study.

RESULTS
The aim of this study was to use ANN to accommodate site effects for PGA predictions. Therefore, the following five combinations of input parameters of ANN regression models were studied. The logarithm residuals shown below are calculated using the all 53531 data, including training data and validation data. The number of input parameters of each ANN regression model is summarized in Table 2 and the detail of the input parameters is described in the following paragraphs: (1) Pv + IAA; this model only considers two P-wave parameters without any site characteristic parameters. Figure 2a shows the predicted PGA using the Pv + IAA ANN regression model. Each star marker represents a ground motion data. The vertical axis in the figures represents the predicted PGA, while the horizontal axis denotes the measured PGA. To evaluate the performance of the ANN regression model, the standard deviation (STD) of the difference between the logarithm predicted PGA and measured logarithm PGA, that is, the STD of the logarithm residuals, lnErr v , was calculated. A lower lnErr v value indicates a more accurate estimation of the PGA value in general. The lnErr v without any site information is 0.5307, which is designated as a reference.  Figure 2b shows the predicted PGA of ground motion data using the Pv + IAA + NEHRP ANN regression model. The lnErr v considering NEHRP site class is 0.5333, which is even slightly larger than the one of the Pv + IAA model.
(3) Pv + IAA + Vs30; this model considers two P-wave pa-rameters and the Vs30 value. Figure 2c shows the predicted PGA of ground motion data using the Pv + IAA + Vs30 ANN regression model. The lnErr v considering Vs30 is 0.5322, which is still slightly larger than the one of the Pv + IAA model. (4) Pv + IAA + peak F; this model considers two P-wave parameters and peak F. The peak F value is the frequency corresponding to the maximum HVSR17 values at each station. The HVSR17 values are the representative HVSR values at seventeen frequencies. Because the content of ground motion with a frequency largely differing from structural natural frequencies will cause little structural damage, we were only interested in the HVSR17 values within the frequency range of engineering applications. In this study, a frequency range between 0.5 and 20 Hz was considered and the HVSR17 curves were investigated at 0.5, 0.8, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, and 20 Hz. Each HVSR17 value represents the averaged HVSR values within the corresponding frequency range. The HVSR17 values are the average of all earthquake events recorded at the same station, which means that they are identical for different earthquake data recorded at the same site. Therefore, the peak F input of two vibration events is the same as long as it was recorded at the same station. We determined the peak frequency by simply choosing the frequency of the maximum HVSR17 values of each station to obtain systematic and automatic estimates of the peak frequency. Figure 2d shows the predicted PGA from ground motion data using the Pv + IAA + peak F ANN regression model. The lnErr v considering peak F is 0.5206, which is slightly smaller than the one of the Pv + IAA model. (5) Pv + IAA + HVSR17; this model considers two P-wave parameters and the HVSR17 values. Similarly, since the HVSR17 values are the average of all earthquake events recorded at the same station, these values of two vibration events are the same as long as they were recorded at the same station. Figure 2e shows the predicted PGA of ground motion data using the Pv + IAA + HVSR17 ANN regression model. The lnErr v considering HVSR17 is 0.4972, which is evidently the smallest one. Figures 3a -d show the HVSR17 curves of typical stations of four different site classes with a large number of ground motion records, i.e., CHY102, HWA028, HWA014, and CHY076, respectively. The thin line in each figure represents each HVSR17 curve plotted based on each ground motion record from the corresponding station, while the bold line in each figure represents the average of all HVSR17 curves at that station. It is evident that the HVSR17 curves can represent the distinct site characteristic of each station. It is probably the reason of obtaining smaller PGA prediction error because such fruitful information of the site characteristic at each station is considered in the prediction model. (6) Pv + IAA + ALL (Pv + IAA + NEHRP + Vs30 + peak F + HVSR20); we also tried to combine all the proxies in the ANN prediction models to see if better prediction can be obtained. Figure 2f shows the predicted PGA of ground motion data using the Pv + IAA + ALL ANN regression model. It indicates that the lnErr v value of the ANN model with all the proxies is the second best one. This makes sense because too many proxies could confuse the ANN to predict PGA if some proxies are not strongly correlated to the site characteristics.

DISCUSSION AND CONCLUSION
Based on the results, including the average shear wave velocity of soil, i.e., Vs30, does not improve the PGA prediction accuracy. A similar phenomenon is observed when the NEHRP site class value is included. This is reasonable because the NEHRP site class value is also determined based on the shear wave velocity of soil. When including the peak frequency, the results are only slightly improved. On the other hand, the ANN model with Pv + IAA + HVSR17 input showed the best PGA prediction performance because it yielded the lowest STD values compared with the results of all the other ANN models. Therefore, it can be concluded that the proposed HVSR17 parameters contribute to the accuracy of PGA prediction of on-site EEW systems.
Many regional EEW techniques predict PGA using attenuation law after the magnitude and hypocentral distance are estimated. Based on the general attenuation relationships between horizontal PGA and hypocentral distance of hard rock sites in Taiwan, the STD of the logarithm residuals, i.e., lnErr v , is approximately 0.78 (Jean et al. 2006). If the site amplification modification based on empirical regression of earthquake data at each station is employed, then the STD of the logarithm residuals is reduced to approximately 0.537 (Chang 2002). In this study, we predict the PGA using ANN based on P-wave and HVSR17 parameters at the same site. The STD of the logarithm residuals is approximately 0.497. Besides, the reduction of STD of the logarithm residuals after considering site effects using HVSR17 is not as impressive as the one of attenuation relationship. This is probably because that the P-wave parameters used to predict PGA are measured at each site, which implies these parameters are already the product of the site effects. As a results, the on-site EEW implemented in this study somewhat already considers site effects. However, evidently, including the HVSR17 can achieve better prediction of PGA. Nevertheless, comparing to the regional approach, the onsite approach we proposed actually reaches comparable PGA prediction accuracy.
Although the Pv + IAA + HVSR17 model achieves the best results, the uncertainty of the PGA prediction in Fig. 2e still looks quite large, especially for the ground motion data with large measured PGA. Therefore, we tried to observe the performance of the data of different earthquake events with large PGA. We found that the predicted PGA of the Chi-Chi ground motion data recorded on 21 September 1999 in Taiwan, as can be observed in Fig. 4a, generally is remarkably underestimated comparing to the real PGA. As for all the other earthquake events shown in Fig. 4b, it is evident that the predicted PGA are fairly distributed both in the underestimated and overestimated sides. The STD of the logarithm residuals is further reduced to 0.4823 if the Chi-Chi ground motion data are excluded. This explained the large uncertainty of the predicted PGA is mainly due to the Chi-Chi earthquake whose rupture of the major asperity was 13-seconds after the minor asperity rupture (Ma et al. 2001). Therefore, based on the first few seconds of P-wave of Chi-Chi earthquake become not possible.
(a) (b) Fig. 4. PGA prediction of data of (a) Chi-Chi earthquake event, (b) all the other earthquake events using the Pv + IAA + HVSR17 ANN regression model.