Research ArticleOCEANOGRAPHY

Projected 21st century changes in extreme wind-wave events

See allHide authors and affiliations

Science Advances  10 Jun 2020:
Vol. 6, no. 24, eaaz7295
DOI: 10.1126/sciadv.aaz7295


We describe an innovative approach to estimate global changes in extreme wave conditions by 2100, as a result of projected climate change. We generate a synthetic dataset from an ensemble of wave models forced by independent climate simulation winds, enhancing statistical confidence associated with projected changes in extreme wave conditions. Under two IPCC representative greenhouse gas emission scenarios (RCP4.5 and RCP8.5), we find that the magnitude of a 1 in 100-year significant wave height (Hs) event increases by 5 to 15% over the Southern Ocean by the end of the 21st century, compared to the 1979–2005 period. The North Atlantic shows a decrease at low to mid latitudes (≈5 to 15%) and an increase at high latitudes (≈10%). The extreme significant wave height in the North Pacific increases at high latitudes by 5 to 10%. The ensemble approach used here allows statistical confidence in projected changes of extremes.


Human activities in both coastal and offshore regions are strongly affected by local wind-generated wave (wind-wave) events. There has been increasing interest in changes in wind-wave climate both historically and for future projections (1, 2, 3). Long-term satellite measurements have shown that wind-wave climate is changing over the global oceans (1). Coordinated international efforts (4) have been focused on collecting and analyzing international datasets and models to understand mean and higher percentile global ocean wind-wave climate. Although understanding how mean conditions are projected to change is important, it is the extreme wave conditions that are critical for both coastal and offshore areas. The design sea state for offshore and coastal structures is typically defined as the maximum significant wave height that can be expected over an N-year period (5). Offshore structures, such as oil rigs, or wind farms, and coastal protection structures (rock breakwaters and sea walls), usually consider the 1 in 100-year significant wave height, Hs, as the design sea state.

Wind-wave extremes are also crucial for the determination of coastal sea levels and coastal erosion, and changes in the climate may further exacerbate the already predicted strong societal and economic impacts of wind-waves on the world’s coasts (6, 7). It has been estimated (8) that in 2010, 290 million people worldwide lived below the 100-year flood level and US$9600 billion of assets were exposed to inundation. Therefore, it is of paramount importance to understand the implications of climate change on extreme wind-wave conditions.

In many scientific fields, the one in N-year values (i.e., the return values) of extreme events is estimated using extreme value analysis (EVA) (9). Long and homogeneous time series are needed to perform a robust analysis and confidently estimate these extreme values. However, spatial and temporal sampling (10, 11), along with observational bias issues (12, 13, 14, 15, 16, 17), limit estimates of extreme significant wave heights. Thus, extreme value estimates are typically characterized by large statistical uncertainty.

Dynamical and statistical analysis approaches have been applied to investigate possible changes in the wind-wave climate from projections of the mean and percentile values (2, 3, 18), yet, aside from a few studies (19, 20, 21, 22), a reliable analysis of the projected changes in wind-wave extremes is lacking. This results in a lack of consensus on whether the frequency or magnitude of these extreme events is affected by climate change.

The present analysis aims to quantify how the magnitude and frequency of wind-wave extreme events may change under future projected climate scenarios. Hence, we need to determine the difference between historical estimates of extremes and future projections. Determining these differences when both the historical estimates and future projections of extremes contain large statistical uncertainty is problematic. Therefore, here, we use an ensemble approach that can reduce uncertainties attributable to standard EVA methods (i.e., enhance confidence in the estimated extreme values), as well as quantifying intermodel and interscenario uncertainties (3, 23, 24, 25). In this approach, we analyze an ensemble of Global Climate Models (GCMs) used to force a wave model, as independent realizations of ocean wave conditions, thus enabling the pooling of the highest extreme wave conditions (after applying bias correction to reduce the effect of different GCM forcing; see Materials and Methods). This approach forms a large dataset that increases the confidence of the statistical estimates thereof. This approach has been shown to work well when applied to single-model ensemble forecasts of historical extreme sea states (26, 27, 28, 29) but has not previously been applied to extreme values extracted from different GCMs and to future projections of global extremes. Although at the present resolution, GCMs are not able to resolve localized extreme events such as the tropical cyclones (TCs) (20, 22), the strength of this approach is that it produces more robust results compared to previous studies that have attempted to address the intrinsically uncertain nature of the ocean-atmosphere state (30).


Model ensemble

We investigated changes in global 100-year return period significant wave heights, Hs100, over the 21st century, from an intermodel ensemble of seven global wave model runs [using the same WAVEWATCH III v3.14 (31) wave model] forced with independent GCM surface winds, part of the international Coupled Model Intercomparison Project 5 (CMIP5) (32). The present ensemble is part of a larger grouping of models (3), which has been used to examine projected changes in mean and percentile wave conditions. The subset of seven models that was chosen as the estimation of extreme conditions requires at least 6-hourly data over the full duration of the model runs, in contrast to the estimation of mean and percentile projections. These 6-hourly data were available for this subset, which was previously tested for its wind-wave simulation skill (33). The smaller number of ensemble members renders the present analysis computationally feasible. The subset was further tested to ensure that it was representative of the larger ensemble (see Materials and Methods), reproducing comparable mean and 99th percentile climatology, as well as projected changes, in both quantities by 2100 (see fig. S1).

The ocean wave extremes were pooled from the ensemble of global wave model runs, which collectively represent an equivalent time period much larger than commonly adopted EVA approaches (see Materials and Methods). To be able to pool the model results in this manner, the data must be independent and identically distributed (i.i.d.) (9). In the present case, we made the pragmatic assumption that the model results are independent, as each wave model run is forced with a different GCM, independently initiated. To be identically distributed, the models must generate comparable extreme values (see Materials and Methods) (29). These assumptions are verified by pair-wise model comparisons as shown in fig. S7.

The tail of the empirical probability distribution function (epdf) of the synthesized dataset was constructed as in fig. S2, which demonstrates that there are contributions from each of the seven models (identically distributed; see Materials and Methods). Although all models contribute as shown in fig. S2 (B and C), the contributions are not uniform, as shown by the nested gray bar plots (fig. S2, B and C). This occurs because the stacked colored bars represent the 1000-peak ensemble of non–bias-corrected Hs model results. A bias correction (see Materials and Methods), performed using the Z standardized variable, largely reduces the uneven contribution of GCMs such as the MRI-CGCM3 (fig. S2, D and E). After the bias correction, we fitted a probability density function (fpdf) to the pooled data and interpolated the 100-year return period, as is commonly done in the peaks over threshold (PoT) EVA approach (Materials and Methods). For clarity, we refer to the probability distribution function of the synthesized data as epdf and to the probability density function fitted to these data as fpdf. Note that with the ensemble dataset, the 100-year value can be determined by interpolation, rather than extrapolation, which would be necessary in a conventional EVA. In the case of the ensemble analysis, the required probability level (100-year value) is within the sampled probabilities.

The performance of the intermodel ensemble approach is demonstrated in Fig. 1. This figure shows the Hs100 estimates obtained with the model ensemble, compared with conventional PoT approaches applied to both altimeter data (34) and a single-wave model forced with National Oceanic and Atmospheric Administration (NOAA) Climate Forecast System Reanalysis (CFSR) winds (35). The comparison of the Hs100 values obtained from both the CFSR-forced wave model run and the satellite dataset shows the level of uncertainties around the extreme estimates (Fig. 1). The wave model forced with the CFSR high-quality wind field, which has been calibrated against in situ and remote ocean observations (35), acts as a model benchmark to compare with the climate simulations. These climate simulations run just with prescribed boundary conditions, without any in situ or remote sensing observation assimilated throughout the years. Note that this CFSR-forced reanalysis simulation is independent of the ensemble used (i.e., not a member of the ensemble). Figure S3 shows the differences between the model ensemble and the CFSR-forced global distributions of Hs100. Although there are differences of magnitude between the various approaches, the spatial distributions obtained with the intermodel ensemble of pooled extremes (Fig. 1A) is remarkably similar to the other datasets processed using traditional EVA approaches (Fig. 1, B and C). The main differences are found in the TCs areas, where both the GCMs and altimeter data are unable to adequately model extremes due to their coarse resolution (22) and sampling density (34). The spatial distribution and magnitude of the extreme values estimated from the intermodel ensemble are also remarkably similar to previous global significant wave height EVAs (24, 28, 29, 36), providing further support for the multimodel approach.

Fig. 1 Magnitude of the 100-year return period significant wave height Hs100[m] resulting from three different dataset EVAs.

(A) The 1979–2005 intermodel ensemble of the bias-corrected Z standardized variable highest peaks (see Materials and Methods), pooled from global wave model runs forced with seven GCM surface winds. (B) The 1985–2018 calibrated altimeter dataset using the PoT approach with exponential distribution fit. (C) The 1979–2005 peaks over 99.6th percentile threshold for the single global wave model run forced with a reference NOAA CFSR wind speed (exponential distribution fit).

Confidence limits for the magnitude of the Hs100 values vary because of different dataset characteristics and statistical approaches. The advantage of pooling extreme wave data from an intermodel ensemble of wave models is the reduction of the intrinsic uncertainties connected to EVA estimates (25). The pooled dataset, in effect, extends the length of the available model time series, being representative of a time window longer than the return period sought (1 in 100 years, in this case). This increases the level of confidence in the extreme value estimates in comparison to common EVA approaches as shown in fig. S4, where a comparison of 95% confidence limits for Hs100 obtained for both the intermodel ensemble and the single CFSR-forced model indicates a reduction in the magnitude by a factor of approximately 3 (see Materials and Methods). As a result, confidence in the determination of projected changes in significant wave height extremes by the end of the 21st century using the intermodel ensemble is also improved.

Extreme value projections

In this study, a historical (1979–2005) intermodel ensemble was compared with an intermodel ensemble of future projections at the end of the 21st century (2081–2100) for two different Representative Concentration Pathways (RCPs): RCP4.5, intermediate emissions scenario, and RCP8.5, high-emissions scenario (37). We determined the global Hs100 differences between these two periods for both emission scenarios (Fig. 2), with model extreme values bias-corrected using a standardized variable correction technique, which largely reduces the effect of GCM differences, caused by differences in both physics and model resolution (see Materials and Methods).

Fig. 2 Percentage change in the 100-year return period significant wave height by the end of the 21st century relative to the 1979–2005 period.

Standardized variable analysis approach is used for the intervals 1979–2005 and 2081–2100. (A) RCP4.5 mid-emission scenario. (B) RCP8.5 high-emission scenario. The regions with statistically significant changes at 5% level (see Materials and Methods; Eq. 7) are hatched.

Results show that the Southern Ocean is characterized by an overall increase in the Hs100 at the end of 21st century for both emission RCPs (Fig. 2). In this region, the intermodel ensemble estimates show increases of approximately 5% (RCP4.5) to 15% (RCP8.5) for Hs100. The South Pacific shows decreasing wave extremes around 30°S and a slight increase in the significant wave extremes around 10°S. This indicates a possible strengthening of the trade winds by the end of the 21st century, in agreement with recent studies (38). Increases between 5 and 10% (both RCPs) are indicated in the Indian Ocean west of Australia and in large areas of the South Atlantic. The changes highlighted above are all statistically significant, as indicated by the hatching in Fig. 2. The hatching is performed as explained in Materials and Methods and in fig. S5. The Northern Hemisphere shows a more irregular pattern. The models predict increases in Hs100 at the high latitudes of both the North Pacific and North Atlantic. However, tests of statistical significance (see Materials and Methods and Fig. 2 hatching) indicate that the North Atlantic increases are not statistically significant. The projected decrease in extreme significant wave heights of −5 to −15% in the mid and low latitudes of the North Atlantic basin is, however, statistically significant.

Uncertainty assessment

To account for differences in the wave climatology across the models in the ensemble, a bias correction approach using a Z standardized variable (Materials and Methods) was used. To investigate the projected changes in extreme significant wave heights in more detail, we examined the epdf of this Z standardized variable (Materials and Methods) at specific locations (fig. S6). This Z ensemble strongly reduces the uncertainties associated with the Hs100 estimates from both the historical 1979–2005 dataset and the two future projection scenarios. Figure S6 (A, D, and G) shows the 95% confidence intervals of the Hs100 estimates at three selected locations from the following: the ensemble of models, each individual model, and the average across the models. The confidence limits represented by the error bars in the graphs are considerably smaller for the ensemble approach than for any of the single models when analyzed using a common PoT analysis. Figure S6 (B, C, E, F, H, and I) shows epdfs from 5° × 5° regions around these three specific locations, the averaging over the region being used to further reduce statistical variability in the epdfs. The Southern Ocean region around location (120°E, 50°S) is an area where the extreme significant wave height is projected to increase. The epdfs (fig. S6, B and C) show that this occurs because of an increase in both the frequency and the magnitude of extreme events for both RCP4.5 and RCP8.5. The changes in magnitude are depicted by a shift toward the right of the future scenario epdfs if compared to the histogram of the historical dataset. The changes in the Southern Ocean event frequency are shown by a translation of the future scenario epdfs along the y axis (fig. S6B). The quantile-quantile (Q-Q) plot (fig. S6C) analyzes the details of these changes. Here, the changes in frequency are shown by the departure of the future scenarios red and orange quantiles from the 1:1 line in the region of plot represented by the bulk of the dataset.

The error bars show that the projected changes by the end of 21st century are statistically significant for this location (i.e., the error bars do not overlap; fig. S6A). The North Pacific region around location (190°E, 50°N) is also an area where the extreme significant wave height is projected to increase. In this case, the comparison with the historical dataset (fig. S6, E and F) shows that the epdfs of the two projected emission scenarios are similar in terms of frequency, but there is an increase in the magnitude of extreme waves by the end of the century. The confidence limits (fig. S6D) indicate that this change is statistically significant for both future scenarios. The North Atlantic location (320°E, 40°N) is representative of an area where the extreme significant wave height is projected to decrease. The 5° × 5° region epdfs (fig. S6, H and I) show that this decrease is a result of a decrease in both the frequency and magnitude of the extremes for both RCP4.5 and RCP8.5 emission scenarios. The confidence intervals, however, indicate that the projected changes are not statistically significant (fig. S6G).

Change in frequency of extremes

An additional investigation of changes in the frequency of extreme events was undertaken by determining, for each single-wave model, the number of extreme events above defined thresholds for both the historical (1979–2005) and future projection (2081–2100) datasets. For each model, the threshold was set as a percentile of the historical dataset. The number of extremes above this value, for both the historical and future projection datasets of each model, was obtained, and then the seven-model difference in the number of the extremes (count/years) between historical and future projection was computed (see Materials and Methods). Initially, we tested a 99.6th percentile threshold, which approximately corresponds to the percentile, over the total number of model outputs, of the 1000 storm events pooled for the EVA performed at each location of the globe. To confirm the consistency of the results, we also tested a 90th percentile threshold. Figure 3A shows the change in the number of events above the 90th percentile level, and Fig. 3B shows the case for the 99.6th percentile threshold. For the Southern Ocean, the increase in the Hs100 is associated with an increase in the number of extreme storms. This is a consistent result for both thresholds (Fig. 3). Thus, the change in frequency of these events affects the return period estimates. In the North Pacific, we found an increase in significant wave height at the end of the 21st century (Fig. 2), but there is no significant increase in the frequency of occurrence of these events (Fig. 3). Here, it is an increase in the magnitude of the events that is responsible for the changes in the 100-year return period significant wave height. The projected decrease in values in the North Atlantic is associated with a decrease in the frequency of the extremes (approximately 1 event less per year for the 99.6th percentile threshold).

Fig. 3 Changes in extreme event frequency.

(A) Changes in the number of extreme events per year over the historical 90th percentile threshold between the historical dataset 1979–2005 and future projection 2081–2100 for RCP8.5. (B) As for (A) but for the historical 99.6th percentile threshold. The similar spatial variability demonstrates consistency of the results for different selected thresholds.

TC-generated wind-wave extremes

TC-generated wind-wave extremes deserve separate discussion as the coarse resolution of the GCMs, and the physics schemes used may limit the representation of TC extreme winds, and consequently the modeled wind-wave extremes. The present seven-model ensemble dataset performance in reproducing TC wind waves was analyzed and validated against observations by Shimura et al. (39) in the western North Pacific region. The analysis shows that three of seven models can represent at least 80% of the TC frequency, as well as winds more than 30 m/s. These models are BCC-CSM1.1, MIROC5, and MRI-CGCM3. The ability of these models to depict TC wind-wave extremes is shown in fig. S8, where, for selected National Data Buoy Center (NDBC) buoy locations, we compare the bias-corrected seven-model ensemble wave heights with buoy and satellite altimeter wave height measurements. In addition, in fig. S10, an EVA is performed for each of the wave model runs for the 1979–2005 historical period to show the differences between each model EVA performance in TC regions. The seven-model ensemble has only limited accuracy in representing these events and, as can be seen in fig. S8, underestimates extremes in TC regions. However, it is useful to investigate the magnitude and frequency changes (Figs. 2 and 3) in major TC regions [the western and eastern North Pacific; the North Indian, South Indian, and South Pacific; the Caribbean/Gulf of Mexico; and the open North Atlantic (], and how the GCM projections perform in these regions. Figure 2 shows statistically significant decreases in the magnitude of Hs100 in TC regions, except for the North West Pacific. Figure 3 shows that the frequency of TC extreme wind-wave events decreases globally by 2100. According to the recently published Intergovernmental Panel on Climate Change (IPCC) Special Report on the Ocean and Cryosphere in a Changing Climate and reference reviews on global future projections in TC events (40, 41), there is medium confidence on a general future projected increase in category 4 and 5 TC events. If this is the case, Hs100 future projected values should also increase as a result of the shift in magnitude of TC events. This differs from the results obtained with the present seven-model ensemble EVA (Fig. 2). However, there is also emerging evidence (low confidence) on a reduction in the frequency of TCs (41), which is in agreement with Fig. 3 results. It should be noted that both the intensity and frequency of the 1000 pooled peaks will affect the EVA. Thus, the reduction in frequency would partially counteract the increase in TC intensity, thus reducing the Hs100 future projected estimates.

As noted above, global wind-wave extreme projections in TC regions are affected by CMIP5 GCM resolution and model physics that limit the confidence around the estimates of TC wind-wave events (40). However, the present EVA approach can infer some information on TCs, derived from a limited number of models contributing to the ensemble. In the future, this approach could take advantage of CMIP6 higher-resolution GCMs, with improved physics, to force a high-resolution global wave model [horizontal resolution of at least 0.25°; (22)], to estimate TC-generated wind-wave extreme future changes. In this way, the ensemble approach performance would be significantly improved, as a much larger amount of information on TC wind-wave events would be embedded in the pooled highest peaks. In that case, we could estimate changes in TC-generated wind-wave extremes, as opposed to non–TC-generated waves, arguably still dominant in the present ensemble dataset (39). At the present level of accuracy, special care must be taken in interpreting the changes in Hs100 in TC regions.

Change along global coastlines

A potential consequence of the projected changes in extreme wave conditions will be enhanced shoreline erosion and possibly coastal flooding in some regions. Figure 4 shows the projected changes in Hs100 along the world’s coastline in 2100 for RCP8.5. This was determined by assigning the change in Hs100 at the closest model point to the coastal segment. This assessment ignores potential nearshore refraction and shoaling at this global scale. Table S1 shows the lengths and percentage of global coastline changes in Hs100 in 5% increments. A total of 59% of the world’s coastlines are projected to experience an increase in extreme wave conditions (table S1). Consistent with Fig. 2B, high-latitude regions of both hemispheres are projected to have an increase in extreme significant wave height, while lower latitudes generally see a decrease. It should be noted, however, that because of model resolution, TC-generated waves are poorly resolved. Therefore, there is less confidence in the projected changes in Hs100 in these tropical regions (30°S to 10°S, 10°N to 30°N; see Materials and Methods). Although Fig. 2B shows increases in Hs100 in the Southern Ocean up to 20%, these large increases are generally at latitudes further south than most of the continental coastlines. The southern tip of South America is projected to experience an increase of approximately 20%, with the west coast of New Zealand and Tasmania experiencing an increase of 10 to 15%. Increases of 10 to 15% are also projected for the coasts of the North Pacific (Canada and Kamchatka Peninsula). As shown in Fig. 2 (see Materials and Methods), changes in the magnitude of Hs100 less than 5% are generally not statistically significant.

Fig. 4 Percentage change in 100-year extreme value significant wave height along the global coastline between the historical dataset 1979–2005 and future projection 2081–2100 for RCP8.5.


The ensemble approach adopted here has the major advantage that the resulting values of Hs100 have considerably smaller confidence intervals than if estimates from a single model were used. This is an advantage when investigating changes in values of Hs100. The reduction in the magnitude of the confidence intervals can be seen at specific locations in fig. S6 (A, D, and G) and globally in figs. S4 and S5 (D and E). This allows the determination of projected changes in extremes with much greater statistical confidence. As a result, the present analysis has been able to estimate statistically significant changes in extreme significant wave height on a global basis. Although projected changes in extreme events appear robust, intramodel uncertainties and uncertainties that arise from atmospheric downscaling and wind-wave modeling are not addressed by this analysis (3). Furthermore, some regions of the global oceans affected by local climate variability, such as TCs, present higher uncertainty due to model limitations in representing these natural phenomena at the current model resolutions (20, 21, 22, 39, 40, 41). Whether TCs are increasing in frequency or magnitude remains an open question (40, 41, 42, 43). However, increasing resolution and improved physics of next-generation models will further address remaining sources of uncertainties, with increasingly accurate estimates of the changes in future projected extreme significant wave heights. Furthermore, we advocate for a continuous archiving of data from GCMs to ensure long climate record. These GCMs are already run over centennial time scales, but high temporal data are archived only for limited time slices.


Intermodel ensemble of ocean storms

An ensemble of global wave model runs of WAVEWATCH-III (WWIII v.3.14) (31), forced with surface winds simulated by different GCMs, was used to develop a dataset of storm wave conditions. These GCMs are part of the CMIP5, feeding into the Fifth Assessment Report of the IPCC. We selected the seven WWIII model runs that performed best in terms of wave climate prediction (33) and applied an EVA (9) to model output. The GCMs used to generate the surface winds, which forced the WWIII wave model (CMIP5/WWIII), were as follows: ACCESS1.0, BCC-CSM1.1, GFDL-CM3, HadGEM2-ES, INMCM4, MIROC5, and MRI-CGCM3 (33). The characteristics of each model are described in table S2. The data are available on the CSIRO data servers [Collaboration for Australian Weather and Climate Research (CAWCR) dataset at].

As noted above, this ensemble is a subset of the larger group used by Morim et al. (3) to investigate changes in mean and 99th percentile significant wave height by 2100. The smaller group was used here as the present analysis of extremes required at least 6-hourly data rather than monthly statistics. These data were archived for these seven models. In addition, the hourly data mean an increase in the amount of data by a factor of approximately 700 per model. Hence, the smaller ensemble was also required to render the problem computationally tractable. The present subset of models is drawn from clusters 2 and 4 of Morim et al. (3), which were broadly representative of the larger ensemble of models. However, to confirm that this is the case, the climatology for the subset was determined as the mean and 99th percentile significant wave height, H¯s and Hs99, for the historical time period (1979–2005). These results are shown in fig. S1 (A and B) and are comparable to figure 1 of Morim et al. (3). Similarly, the percentage changes in H¯s and Hs99 between the historical (1979–2005) and future (2081–2100) time periods for both RCP4.5 and RCP8.5 are shown in fig. S1 (C to F). Again, this is directly comparable with figure 1 (ΔH¯s) and figure 2 (ΔHs99) of Morim et al. (3). These comparisons provide confidence that the present subset of models is representative of the larger ensemble consisting of 83 models.

EVA technique

Our aim here was to pool the data from the ensemble of the seven bias-corrected models. To this end, first, a threshold was set at the 90th percentile for each of the seven models. Storm-independent peaks were then selected above this threshold, with independence guaranteed by ensuring that peaks are separated by a minimum of 48 hours (44). The peak values from the pooled dataset were then ranked, and the highest 1000 were selected (29, 45). The procedure was conducted for both raw model 6-hourly outputs and bias-corrected outputs (fig. S2, D and E). As for a PoT approach, we fitted an exponential distribution to the 1000 highest peaks and found the 100-year return period value at the desired probability level from the relationship (34, 46)P(x<x100)=1NY100NPOT(1)where NPOT is the number of peaks (1000) and NY is the equivalent duration in years of the pooled dataset from which the peaks were extracted. This process of selecting storm peaks from each of the models in the ensemble and pooling them to form extreme value histograms is shown diagrammatically in fig. S2. To perform the EVA, it is necessary to define a representative time period of the pooled dataset (i.e., define NPOT). Once the extremes from each of the models are pooled, we no longer have a single time series. Rather, we have a pooled set of extremes from multiple models, of which we verified the level of independence and identical distribution (fig. S7). Therefore, to compute the equivalent time period for this pooled dataset NY, we assumed that each of the model 6-hourly outputs are representative of a 6-hour time window; this value has been tested in previous literature (29, 45). Considering leap years for the historical dataset (1979–2005), the equivalent time is given by Eq. 2Teqhist=27 years × 365.25 days × 4 hindcasts a day × 6 hours × 7 models=189 years(2)and for the future projection (2081–2100), the equivalent time is given by Eq. 3Teqproj=20 years × 365.25 days × 4 projections a day × 6 hours × 7 models=140 years(3)

As discussed above and in the following sections, the synthesized time period guarantees smaller return period confidence intervals, reducing the uncertainties. Equations 2 and 3 define the values of NY in Eq. 1. Note that as NY is larger than the desired return period (100 years), we are “in sample,” and no extrapolation of the epdf is required. The exponential distribution fitted to the epdf of the data is hence used only to smooth the tail of the epdf and aid interpolation of the extreme significant wave height at the desired probability level (i.e., one in 100-year event; see fig. S2). A similar approach to that applied here has been used to estimate projected changes in extreme significant wave height at selected North Sea locations (26, 27). In contrast, to the present approach, however, this previous analysis used only annual maxima from a single model.

Preliminary test on raw model data

Data from the ensemble of models can be pooled to undertake EVA only if the data comply with specific criteria (29, 45). That is, the data must be i.i.d. (9). To investigate whether the data are i.i.d., we selected several locations around the globe and constructed scatter plots of significant wave height values for the historical model datasets. The scatter plots are for values at the specific locations and the same times. Plots were performed between each combination of two models (fig. S7). As the models are forced with different wind fields, the scatter plots show low correlation (correlation coefficient < 0.13), and it is assumed that the models are sufficiently independent to perform the analysis (29, 45). In addition, Q-Q plots (fig. S7) show that generally the models are identically distributed, i.e., Q-Q line is close to the 1:1 line, indicating similar epdfs. This is particularly true for the bulk of the data, with greater divergence at the extremes. To further test the relationship between the models at the extreme values, we extracted the peaks over the 90th percentile for each m model and compared the quantiles with the quantiles of the intermodel ensemble of extremes taken above the same percentile. We then computed the extreme quantiles root mean square error (RMSE) to define a performance parameter eRMSE (Eq. 4)eRMSE=1RMSEmnHs100Hs100(4)

The performance parameter eRMSE (Eq. 4) describes the similarity (or discordance) between model distributions of the extremes. The smaller the eRMSE, the more similar the distribution of extremes between models, and thus the more robust is the ensemble EVA approach. RMSEm is the root mean square error of the m model 90th percentile extreme quantiles in relation to the intermodel ensemble 90th percentile extreme quantiles, where n is the number of models (in this case, 7), and Hs100 is the 100-year return period value estimated with the intermodel EVA approach (i.e., respectively for historical and future projection datasets).

The eRMSE results for both historical and future projection datasets (fig. S9) show reasonable agreement between model extremes (eRMSE < 0.05), except for the tropical areas, affected by TCs, which are still not adequately represented by the models, due to coarse resolution and missing physics. These results show that some GCMs seem to capture TC characteristics, while others still miss them. This is shown by the EVA performed on each single global wave model run over the 1979–2005 historical time slice (fig. S10).

Robustness of the results

To account for different biases and variances between the various models, we tested two bias correction approaches and a bias-uncorrected approach. As outlined above, we describe the results obtained from a standardized variable correction, which, we believe, works quite well. This approach allows a comparison between models with different biases and variances (47) by standardizing the significant wave height values to each model’s historical mean μmhistand SD σmhist (here, m = 1, 2, …, 7 for the seven models). This is done for both historical (Eq. 5) and future projection (Eq. 6) significant wave height values Hs, mZmhist=Hs,mhist  μmhistσmhist(5)Zmproj=Hs,mproj  μmhistσmhist(6)

The values of Hs were standardized for each of the seven models individually, and then the storm data (extreme values) were pooled. The EVA was then conducted on the 1000 highest Z values extracted from the pooled Zm variables. We then found the 100-year Z100 return period and from this computed Hs100, inverting Eqs. 5 and 6. The inversion was done using the seven-model ensemble historical mean, μenshist, and SD, σenshist. The changes Hs100 between historical and future projection datasets are shown above in Fig. 2.

A second analysis directly compared the extreme significant wave height values from the intermodel ensemble of the historical dataset with the extreme significant wave heights of the intermodel ensemble of future projections. No bias correction was used in this analysis. The PoT analysis was performed on the 1000 highest significant wave height peaks selected as described above. The 100-year return period changes at the end of the 21st century (fig. S11A) show a similar spatial distribution to the Z-corrected results (Fig. 2), especially at the mid and high latitudes of both hemispheres. The main differences are found in the tropical areas where, as already mentioned above, the models have limitations in the representation of extremes (TC areas in fig. S10).

In a third approach, we extracted the 1000 highest peaks from bias-corrected historical and future datasets. We bias-corrected each of the CMIP5/WWIII outputs at the 99.8th percentile level, based on the 99.8th quantile of a reference climate model run. That is the WWIII wave model forced with surface winds from the CFSR of the American National Centers for Environmental Predictions (CFSR/WWIII run). The delta found from the 99.8th percentile quantile of the reference runs was then added or subtracted to the cumulative distribution function of each of the CMIP5/WWIII runs, shifting the tail of the extremes to match the chosen quantile level. This process is similar to a correction with a quantile-based mapping method (48) but limited to a single quantile to preserve the variance of each model tail of the extremes. The changes in Hs100 (fig. S11B) using this approach show greater variability than the first two approaches, but the general distributions are similar, especially at the mid and high latitudes of both hemispheres, demonstrating the robustness of the results in these regions of the global oceans. Changes in the tropical areas remain an open question, and this bias correction approach introduces larger deviations in the results for these areas.

In addition to this, we test the consistency of Hs future changes, performing an EVA at different return levels. These correspond to the 1 in 10-, 20-, 50-, and 100-year return periods. The EVA extreme estimates for each return level are shown in fig. S12, with the percentage of change of the 2081–2100 RCP8.5 projections relative to the 1979–2005 period. The results show the different magnitude of the estimates for both historical and future projection datasets, from the smallest (10-year return period) to the largest (100-year return period). The percentage of change in significant wave height by the end of the 21st century, relative to the 1979–2005 period, is consistent through the different return period analyzed. The 10-year return period results (fig. S12) are in agreement with Wang et al. (19) in the Southern Ocean, the eastern tropical Pacific, and the North Atlantic.

Model contributions

An important element of the pooled model approach is that all the models contribute to the EVA estimates. That is, the peaks composing the tail of the pooled extreme epdf come from all CMIP5/WWIII runs, rather than a small number of models dominating the selected extreme values. To test this, the model contributing each of the peaks in the ensemble EVA was tracked. The number of peaks from each model making up the highest 1000 Z values in the pooled extreme dataset was determined at selected test locations. The model contribution to the 1000 peaks for non–bias-corrected data is shown for these locations as pie charts (fig. S13). This figure shows that even without any bias correction, all seven models contribute to the fpdf and thus the return period estimation. The bias correction process further ensures an even more uniform contribution to the peaks across the individual models of the ensemble, as shown in fig. S2.

Statistical significance

Once the 1000 peaks are extracted from the intermodel ensemble, we can bootstrap the confidence intervals at the 95% level (49). The bootstrap technique is performed by resampling 500 times the 1000 pooled peaks with replacement. Each of the 500 samples is composed of 1000 values. With replacement, a bootstrap sample may have some values of the original 1000 peaks repeated and some not appearing (49). Then, for each grid point, we compute the 100-year return value fitting the exponential distribution to each of the 500 samples. The 95% confidence level is given by the 0.025 and 0.975 percentiles of the 100-year return values found. The confidence intervals for both historical and future projection Hs100 (fig. S5, D and E) are significantly smaller than results found with common EVA techniques (see fig. S6, A, D, and G) (28, 29). Taking advantage of the reduced statistical uncertainty, we here introduce a test to evaluate the statistical significance of the extreme significant wave height changes at the end of the 21st century. The significant wave height changes at the end of the 21st century are regarded as statistically significant if the 95% confidence limits of the historical and future extremes do not overlap, that isHs100>0.5 × (CI95,histCI95,proj)(7)where Hs100 is the difference between historical and future projections of Hs100 (Fig. 2) and CI95 are the confidence intervals (i.e., the range of values in which we are 95% confident to find Hs100 for the historical CI95, hist and the future projection CI95, proj intermodel ensembles).

Statistically significant values of changes in Hs100 are shown with hatching in Fig. 2. As can be seen in this figure, most areas where the changes in Hs100 are greater than 5% in magnitude (positive or negative) are statistically significant. One of the major advantages of the pooled ensemble approach used here is that the reduction in confidence limits of the extreme value estimates means that differences in extreme values can be determined with greater statistical confidence.

A further demonstration of the increase that is statistical confidence due to the pooled ensemble approach is demonstrated in fig. S14, which shows the changes in Hs100 by the end of the 21st century for RCP8.5, calculated using a peaks over 99.6th percentile threshold (PoT) for each of the individual models in the ensemble. Figure S14 can be directly compared with the ensemble result in Fig. 2B. Although some of the spatial variations evident in Fig. 2B are evident when considering the models individually, the increase in statistical variability is evident. Applying the same approach as above to determine statistical confidence, we find that almost none of changes in Hs100 are statistically significant at the 95th percentile level when using individual models.

A further comparative test was undertaken by evaluating the average of the changes in Hs100 from the individual models in fig. S12. This yielded a result comparable to the ensemble result (Fig. 2B), augmenting confidence in the robustness of the ensemble approach. However, as shown in fig. S6 (A, D, and G), the confidence limits using such an averaging approach are still much larger than for the ensemble approach.


Supplementary material for this article is available at

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: Funding: I.R.Y. acknowledges the support of the Australian Research Council through grant DP160100738. M.H. is supported by the Australian Commonwealth National Environmental Science Program Earth Systems and Climate Change Hub Author contributions: A.M. and I.R.Y. conceived the project. A.M. and I.R.Y. performed the technical analysis with contribution from M.H. R.R., A.M., and I.R.Y. wrote the manuscript with input from M.H. R.R. and A.M. performed model simulations and statistical analyses. E.K. performed the global coastlines analysis. Competing interests: The authors declare that they have no competing interests. Data and materials availability: The raw datasets used in the study were obtained from the CSIRO data servers (CAWCR dataset at All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.
View Abstract

Stay Connected to Science Advances

Navigate This Article