## Abstract

Many studies have sought to constrain climate projections based on recent observations. Until recently, these constraints had limited impact, and projected warming ranges were driven primarily by model outputs. Here, we use the newest climate model ensemble, improved observations, and a new statistical method to narrow uncertainty on estimates of past and future human-induced warming. Cross-validation suggests that our method produces robust results and is not overconfident. We derive consistent observationally constrained estimates of attributable warming to date and warming rate, the response to a range of future scenarios, and metrics of climate sensitivity. We find that historical observations narrow uncertainty on projected future warming by about 50%. Our results suggest that using an unconstrained multimodel ensemble is no longer the best choice for global mean temperature projections and that the lower end of previous estimates of 21st century warming can now be excluded.

## INTRODUCTION

In 2018, the special report of the Intergovernmental Panel on Climate Change (IPCC) on global warming of 1.5°C (*1*) stated that “human activities [...] have caused approximately 1.0°C of global warming above pre-industrial levels.” Climate change is now well under way. As the global warming signal increases, observations provide increasingly accurate information about how the climate system responds to an increase in atmospheric greenhouse gases (GHGs) concentrations and other human factors. Consequently, there is growing interest in monitoring ongoing climate change and providing near-term and long-term projections that are consistent with the latest observations.

First attempts to constrain projections from historical observations were based on detection and attribution (D&A) techniques (*2*–*5*), i.e., estimates of how much past warming is attributable to human influence on climate. Although recent such analyses reported a downward revision of the upper bound of projected warming (*6*, *7*), these studies primarily find a modest narrowing of uncertainties. As a result, the expected 21st century warming in response to various scenarios was calculated directly from model outputs in the IPCC Fifth Assessment Report (AR5) (*8*), i.e., without any observational constraint. Other studies tried to constrain metrics of climate sensitivity, such as the transient climate response (TCR; i.e., the global mean warming at CO_{2} concentration doubling, in response to a yearly 1% increase in CO_{2} concentration) or the equilibrium climate sensitivity (ECS; i.e., the global mean warming at equilibrium after a doubling of CO_{2} concentration) (*9*, *10*), using energy balance models (*11*), or directly from Earth’s energy budget (*12*). Although not all results agree (*9*), most of these studies also find a limited historical constraint and/or tend to modestly lower the range of values derived from climate models.

In an apparent contrast with the downward revision of the upper bound of the projected warming, the latest generation of climate models, from the Coupled Model Intercomparison Project Phase 6 (CMIP6) (*13*), exhibits higher sensitivity to CO_{2} increase than the previous generation (*14*) (about 10% higher using the set of models considered in this study). As a result, the question of consistency between model projections and historical observations becomes even more critical for this new generation of models (*15*). Do the levels of warming previously estimated in response to a range of emission scenarios need to be revised upward? Or do historical observations rule out some of these simulations?

In this study, we reassess human-induced warming ranges that are consistent with historical observations in the past, in the near term and in the long term, and in light of three new factors: newly available climate models, improved observations, and a new statistical method.

Observations of near-surface atmospheric temperature (SAT) have been subject to major improvements recently. Commonly used datasets of observed temperature were based on spatially incomplete observations (*16*) and sea surface temperature (SST) instead of SAT over oceans (*17*). Together, these two effects have been shown to artificially reduce estimates of past warming by about 15%, with implications, e.g., in terms of remaining carbon budget (*18*). Accounting for these recent findings and improved observed data is critical to constrain future warming.

The statistical method is another key source of improvement over previous work. Most previous studies rely on linear regression techniques (*2*–*7*, *10*) and do not consider model uncertainty in the spatial or temporal pattern (*19*, *20*). Other constraints based on historical warming often focus on the trend over a specific subperiod (*15*), potentially neglecting useful information. A third approach relies on simple climate models in which parameter uncertainty can be explored comprehensively and then subselecting those trajectories that are consistent with historical observations (*21*, *22*). Here, we use climate models of full complexity to provide a prior on the real-world forced response and then derive the posterior of this response, given the historical observations. This new statistical approach overcomes previous limitations: Model uncertainty is fully taken into account, and the entire observational record is used to constrain past and future responses to various external forcings in a consistent way.

Using these three components enables us to narrow uncertainties in climate change projections and suggests that the lower end of the IPCC AR5–projected warming ranges can now be excluded. Given the high costs and damages associated with climate change impacts, mitigation, and adaptation, providing more accurate climate change projections, as is performed in this study, may offer substantial benefits to society.

## RESULTS

### Attributing recent warming

We first estimate the externally forced [i.e., all (ALL) forcings] warming to date. While this has often been performed using linear trends (*7*, *23*), our technique can deal with a more complex temporal shape (determined by complex climate models) and account for some uncertainty on that shape. We find that the total forced warming is 1.09°C ± 0.14°C (all confidence ranges in this study are 5 to 95%), on average, over the 2010–2019 decade relative to the 1850–1900 preindustrial baseline (Fig. 1A) (*24*). This value is very close to the observed warming in GSAT (global and annual average of SAT) of 1.07 ° ± 0.07°C (considering observational uncertainty only) but exhibits larger uncertainty. If the calculation is performed for the year 2020 specifically (still relative to the 1850–1900 average), then the estimated total forced warming increases to 1.22 ° ± 0.16°C . These values are higher than previous estimates, which do not account for the effects of spatial coverage and/or SST blending (*1*, *25*).

We then use the technique to narrow uncertainty on the warming attributable to various external forcings (Fig. 1A). Focusing on 2010–2019 again, we find that a very small fraction of the total forced warming, 0.06 ° ± 0.02°C, is attributed to natural (NAT) forcings—partly explained by reduced volcanic activity over the past decade, while the 1850–1900 base period was affected by the Krakatoa eruption. The warming induced by anthropogenic (ANT) forcings is close to the total forced response: 1.03 ° ± 0.15°C. The warming induced by GHGs only is estimated to be 1.42 ° ± 0.31°C and is partly offset by a cooling induced by other anthropogenic (OA) forcings (aerosols, ozone, and land use changes) of −0.39 ° ± 0.27°C. The observational constraint is particularly pronounced for the ALL and ANT components (uncertainty reduced by ∼70% compared to the unconstrained ranges) and more modest for GHGs and OA (50 and 30% reduction, respectively). These results are consistent with previous assessment (*23*), although the quantification of temperature changes with respect to the preindustrial baseline is novel. Last, if compared to unconstrained ranges, then the constrained ranges essentially reject the highest responses to both GHGs and OA, and the smallest responses to GHGs are also ruled out.

A similar analysis can be applied to the 2010–2019 warming rate (Fig. 1B). The observed warming trend of 0.32 ° ± 0.01°C per decade (considering observational uncertainty only) is found to lie slightly above the upper end of the forced response (0.25 ° ± 0.05°C per decade). This suggests that internal variability has enhanced the warming trend over the past decade—consistent with the end of the global warming slowdown and a marked El Niño event in 2016. The current ANT-induced warming rate is estimated to be 0.22 ° ± 0.05°C per decade, which is slightly higher, and has narrower uncertainties than previous estimates (*1*, *25*). Note that these numbers are derived using the Shared Socioeconomic Pathway (SSP) 2-4.5 scenario to extend historical simulations after 2014 and that even higher values are found if the SSP5-8.5 scenario is considered (fig. S5). We find a substantial NAT-induced warming trend over the past decade of 0.03 ° ± 0.01°C per decade, driven by both solar and volcanic activities. The GHG-induced warming rate, 0.23 ° ± 0.06°C per decade, is very close to model estimates. The sign of the OA contribution over the past decade is very uncertain, but the highest simulated rates of OA-induced warming are found to be inconsistent with the observational record. This analysis suggests that the partial offsetting of GHG-induced warming by the OA forcing since the preindustrial period was no longer active over the past decade, consistent with stabilized aerosol emissions over that period (*11*, *15*).

### Constraining projections

We now examine the implications of the observational constraint on the responses to specific scenarios such as the SSPs (*26*). In this respect, our procedure has some advantages compared to other observational constraint methods. Using the entire observed time series avoids ignoring useful information and facilitates annual updates. This also helps distinguish the responses to GHGs from OA factors, thanks to the different emission time series for these two subsets of forcings. In addition, the entire time series of constrained projections is derived in a consistent way, which is particularly attractive for near-term projections (*23*). Last, our method enables us to provide confidence ranges specifically for the forced response, while many other studies also include internal variability.

We find that historical observations effectively constrain 21st century projections (Fig. 2). For all three scenarios considered, our observational constraint narrows the uncertainties (confidence interval widths) by a factor of about 2 at the end of the 21st century and by a factor of 3 in the short term (i.e., before 2040, consistent with the results for the externally forced warming to date). Such a large reduction in uncertainty is quite notable and is larger than that previously reported (*15*). Part of the discrepancy with other studies is related to the fact that our confidence ranges are representative of the forced response only, i.e., internal variability is filtered out, both before and after applying the observational constraint. But still, this finding raises questions about the robustness of our method, and whether it appropriately accounts for all sources of uncertainties (i.e., potential overconfidence).

To assess the robustness and predictive skill of our technique for the late 21st century warming, we perform a perfect model evaluation (*10*, *27*). We alternatively withhold one 1850–2100 transient run, treat its historical component (extended to 1850–2019) as pseudo-observations, and apply the method to predict the 2081–2100 SSP5-8.5 warming, using all other models to construct the prior—following a cross-validation procedure (more details are given in the Supplementary Materials). We use the coverage probability of our technique, i.e., the proportion of the time that the confidence interval contains the true value of interest, as a key criterion to evaluate its robustness. Results suggest that our statistical method is able to strongly decrease the uncertainty on future warming while preserving the desired confidence level (Fig. 3). We estimate a coverage probability of 86% giving equal weight to each single run (while 90% is expected), but this number is heavily influenced by the large CanESM5 (Canadian Earth System Model version 5) ensemble—a model that exhibits very high warming and is therefore more difficult to predict. Giving equal weight to each CMIP6 model (i.e., accounting for the unbalanced sizes of single-model ensembles), the estimate of coverage goes up to 91%, i.e., very close to the desired level. Evaluation of our method using probabilistic scores provides consistent results (fig. S6). Furthermore, the method is able to correctly predict values that lie near the boundary or even outside the assumed 5 to 95% prior uncertainty range [see, e.g., GFDL (Geophysical Fluid Dynamics Laboratory)–ESM4 and CanESM5 models in Fig. 3]. Overall, these results demonstrate that our method is not overconfident and that the constrained uncertainty ranges are reliable. Nevertheless, this figure also suggests that some model responses might be harder to predict than others, i.e., the coverage probability for individual models can deviate from the desired 90%—although the coverage probability is correct if averaged over all CMIP6 models.

Beyond the overall level of reduction in uncertainties, the analysis of constrained ranges provides further insight (Figs. 2 and 4). In response to a moderately low emission scenario (SSP2-4.5), the constrained warming in 2100 is estimated to be 3 ° ± 0.7°C with respect to 1850–1900 (as compared to 3.2 ° ± 1.2°C for the unconstrained set of models). This range rises to 4.9 ° ± 1.1°C (and 5.4 ° ± 1.9°C for the unconstrained) in response to a high-emission scenario (SSP5-8.5). Human-induced warming is projected to exceed 1.5°C between 2027 and 2047 (best estimate, 2034), under SSP2-4.5. This range is consistent with extrapolating the estimates of ANT warming to date and ANT warming rate reported above.

Consistent with previous studies (*6*, *12*, *15*), we find that the upper end of model projected warming is not consistent with historical observations. More generally, the observational constraint revises unconstrained model projections downward, as central estimates are also lower after the constraint than before. However, unlike previous studies, we find that both sides of the uncertainty ranges are affected by the observational constraint, suggesting that both the highest and the lowest sensitivities inferred from the CMIP6 ensemble are inconsistent with observations. This finding is consistent with our results for the past GHG-induced warming. Then, if compared to the projected warming ranges from the IPCC AR5 (Fig. 4), then our late 21st century constrained ranges based on CMIP6 lie toward the upper end of previous estimates: Upper bounds are essentially unchanged, while lower bounds are revised upward. So, overall, our observational constraints revises the warming simulated by climate models downward, while our observationally constrained projected warming ranges based on CMIP6 exclude the lower end of the AR5-assessed ranges.

### Implications for climate sensitivity

Last, our methodology can be extended to constrain metrics of climate sensitivity such as the TCR and the ECS. Using CMIP6 models, we find a constrained TCR range of 1.84 ° ± 0.51°C, as compared to 1.98 ° ± 0.74°C without the constraint. Consistent with projections, the constrained range primarily revises the unconstrained CMIP6 range downward but excludes the lower end of the IPCC AR5–assessed likely range. The lower bound of our constrained range (i.e., 1.33°C) is consistent with some recent studies (*21*) but is quite high if compared to others (*11*, *15*). One simple argument supporting our finding is that none of the CMIP6 models simulate a TCR lower than their forced warming in 2020 (usually by a large margin; fig. S9). So, our finding of a forced 2020 warming of 1.2°C in the real world could contribute to discarding low TCR values—the use of improved GSAT observations plays a key role here.

Historical observations are expected to constrain ECS less effectively, and any such constraint must be taken with caution—estimating ECS from observations is equivalent to an extrapolation far outside the range of observed climate states. However, deriving implications of our results for ECS remains of interest. Using CMIP6 models only, we find an ECS constrained range of 2.3 ° to 4.6°C, as compared to 2.1 ° to 6.1°C without the constraint. Consistent with other metrics, the model range is primarily revised downward, but our final constrained range excludes the lower end of the IPCC AR5–assessed likely range. We notice that our ECS range remains relatively wide, suggesting that observed GSAT changes do not constrain the long-term equilibrium (e.g., the ECS/TCR ratio) very well.

### Comparing CMIP6 and CMIP5

Applying the same observational constraint using CMIP5 models and Representative Concentration Pathways (RCPs), instead of CMIP6 models and SSPs, is also of high interest (Fig. 4 and fig. S7). On average, the unconstrained CMIP6 ensemble projects a higher warming than CMIP5, consistent with a higher sensitivity (*14*). The CMIP6 ensemble also exhibits much wider uncertainty ranges, leading to a lower bound on 2100 projected warming lower than that of CMIP5. After applying the observational constraint, ranges agree remarkably well in the near term (e.g., before 2040), suggesting that observations play a dominant role in this time frame. CMIP5- and CMIP6-constrained ranges also exhibit comparable width for all scenarios and periods considered. However, they diverge somewhat in the late 21st century: The projected warming in 2100 is 10 to 15% higher using CMIP6, with the largest relative difference for the lowest emission scenarios. This corresponds to a 0.3° to 0.5°C difference in 2100, depending on the scenario. A smaller discrepancy is found in the case of climate sensitivity metrics, typically 5 to 10%. The range for TCR based on CMIP5 models is narrower, with a lower upper bound. The range for ECS is shifted upward by about 0.2°C from CMIP5 to CMIP6.

There are various explanations to these discrepancies. Differences in scenarios (*26*) contribute to a more pronounced 21st century warming in CMIP6 relative to CMIP5. Although RCPs and SSPs are scenarios with nominally the same forcing level in 2100, the actual forcing levels are higher in SSPs (including a higher GHG forcing; fig. S11) (*28*, *29*). Discrepancies in emission scenario contribute to the large gap found in the 21st century warming but cannot explain the reported difference in terms of TCR and ECS. The latter suggests that CMIP6 models might exhibit a higher sensitivity than CMIP5 models, not only in terms of the raw values (*14*) but also if taken conditionally to historical observations. In particular, we notice that the constrained time series of past changes using the CMIP5 or CMIP6 ensembles agree remarkably well. So, these two ensembles provide different estimates of future changes while they are in perfect agreement in the past. Two phenomena can explain or contribute to this. First, the historical forcings used in CMIP5 and CMIP6 differ substantially (fig. S11). The most notable difference is found in the late 20th century, involves the aerosols forcing, which is more negative in CMIP6, and exhibits a different time series. A more negative aerosol forcing can offset a larger fraction of GHG-induced warming, making the observational record consistent with higher sensitivity. Poor representation of radiative forcings after 2005 in CMIP5 simulations was also identified as contributing to a discrepancy between models and observations in terms of the recent warming trend (*30*). Second, progress in the representation of atmospheric physics can affect how models respond to a given change in forcings. For instance, a higher ratio of ECS to TCR, nonlinearities in feedback or GHG forcing (*31*), could cause a different response to the same GHG emissions. Further research will be needed to quantify how these two terms (changes in radiative forcings versus changes internal to the models themselves) are contributing to the findings reported here. Last, the model sampling uncertainty (the number of models in both generations remains quite limited) could also contribute to the reported gap. This issue is further discussed below.

## DISCUSSION

Because of an improved statistical method, we are able to make use of the entire observational record to refine estimates of past and future climate change. Our results suggest that historical observations can now be used to reduce uncertainty in human-induced warming significantly: typically by a factor of 3 in the past and in the near term (i.e., by 2040) and a factor of 2 in the long term (late 21st century). In this context, using an unconstrained multimodel ensemble is no longer the best choice for global mean temperature projections. This is an important finding because few observational constraints were applied to climate projections until recently (*8*). Overall, our results suggest that we are now at the time when direct monitoring of climate change enables narrowing uncertainty on both past and future warming substantially, as was expected (*4*, *32*).

We also derive constrained estimates of attributable warming to date, future warming in response to a range of emission scenarios, and metrics of climate sensitivity, all of them consistently rejecting the lower end of previous estimates. These results have broad implications in terms of foreseeable climate change impacts and the need for urgent mitigation.

A recent study (*15*) applied an observational constraint to CMIP6 projections using a different statistical approach. Our results agree with this study in rejecting the highest sensitivities from the CMIP6 ensemble. However, we report a larger reduction in uncertainty, arising primarily from discrepancies in the lower bound of constrained ranges. This is particularly clear in terms of constrained TCR ranges: 0.90° to 2.27°C in (*15*) (considering the 1981–2014 period), as opposed to 1.33 ° to 2.36°C here (both ranges are 5 to 95%). More work will be needed to fully understand this discrepancy, but we hypothesize that incorporating information about the total forced warming to date (i.e., considering a longer observed record) could contribute to rejecting low TCR values. Another noticeable discrepancy involves the CMIP5 versus CMIP6 comparison: Our observational constraint suggests a slightly higher future warming using the latest model generation, while Tokarska *et al.* (*15*) report consistent results with these two ensembles.

One potential limitation of this work is that our prior distribution might not sample uncertainty comprehensively. A first obvious limitation is related to the number of models (*22*), which remains modest. This number could increase as the CMIP6 archive becomes more fully populated. However, there are also important limitations related to the design of the CMIP exercises. CMIP ensembles are typically ensembles of opportunity, i.e., they are not designed to sample uncertainty comprehensively (*33*). For instance, no models are being intentionally built to provide low or high values of TCR or ECS, and the relatively wide ranges found among CMIP6 models was unintentional. Forcing uncertainty is also poorly sampled in the CMIP6 ensemble. The magnitude of the aerosol forcing in 2014 differs substantially among models (*29*), but uncertainty on the historical emissions of aerosols or aerosols precursors and their time series is not sampled. Last, CMIP models exhibit dependencies among themselves and from one generation to the next (*34*), further reducing the effective model sample size. To some extent, this model uncertainty sampling issue could be alleviated by some improvement of the statistical approach, e.g., using regularized estimates of the model error covariance. A better approach probably requires using perturbed physics ensembles (PPEs) (*35*), with an inclusion of forcing uncertainty, to overcome this limitation. However, production of such coordinated multimodel PPEs remains a challenge for the climate science community. In the meantime, CMIP ensembles remain the best available multimodel dataset to quantify uncertainties about future climate change.

The introduction of the new statistical method proposed in this study also opens new possibilities. Unlike techniques based on energy balance models, which are only suitable at the global or hemispheric scales (*11*), our technique could be easily adapted to constrain past and future changes at the regional scale. Incorporating spatial information, variables other than surface temperature, or other types of constraints (e.g., physically based or paleoclimate constraints), is another promising line of research. Given that accurate climate projections with well-quantified uncertainties are essential for adaptation planning and mitigation policies, we expect the results of this study and similar applications on smaller scales to be of great value to a range of stakeholders.

## MATERIALS AND METHODS

### Data

*Observed data*. We use observations from Cowtan and Way (*16*), hereafter denoted HadCRUT4-CW. This dataset provides an extension of the HadCRUT4 dataset (*36*) with global coverage, thus alleviating issues related to missing data. From this dataset, we compute the global mean near-surface temperature (GMST), without applying any observational mask. Furthermore, this dataset uses a blending of SSTs over ocean and near-surface air temperature over land. Several studies have shown that this metric of global mean temperature warms significantly less than the GSAT (*17*, *18*). To take this discrepancy into account, we multiply the Cowtan and Way GMST by a factor of 1.06, corresponding to a 6% increase (*16*). This factor was estimated in historical runs specifically, while previous calculations made over the 21st century range between 1.05 and 1.08 (*17*, *18*). In addition to affecting the long-term trend, this correction inflates the year-to-year variability in the observed time series.

The HadCRUT4-CW dataset is provided with an estimate of measurement uncertainties, through a set of equally plausible realizations (*36*). This is a remarkable feature, as HadCRUT4-CW is the only dataset with global coverage providing an estimate of observational uncertainty—which is why we chose to use this dataset. In the following, we take the median of this dataset as a best estimate. All realizations are used to estimate the measurement uncertainty. Note that we do not consider other datasets for GMST because they usually fall within the uncertainty of HadCRUT4-CW (*1*); considering the entire HadCRUT4-CW ensemble allows a more comprehensive treatment of uncertainty.

*CMIP data*. We make use of a large set of models from CMIP6 and CMIP5 (*13*, *37*). We took all models providing at least 200 years of preindustrial control simulations, one historical simulation, one 2.6 W m^{−2} of scenario (either RCP2.6 or SSP1-2.6), one 4.5 W m^{−2} of scenario, one 8.5 W m^{−2} of scenario, and either a histGHG simulation [i.e., a historical simulation in which GHGs follow their historical concentrations, while other forcings are kept constant (*38*)], or a 1% CO_{2} simulation (in which the CO_{2} concentration increases by 1% each year). Only the *tas* variable, corresponding to near-surface atmospheric temperature, is used here.

We consider 22 models from the CMIP6 ensemble: ACCESS-CM2, AWI-CM-1-1-MR, BCC-CSM2-MR, CAMS-CSM1-0, CanESM5, CESM2, CESM2-WACCM, CNRM-CM6-1, CNRM-CM6-1-HR, CNRM-ESM2-1, GFDL-ESM4, GISS-E2-1-G, HadGEM3-GC31-LL, INM-CM4-8, IPSL-CM6A-LR, MIROC-ES2L, MIROC6, MPI-ESM1-2-HR, MRI-ESM2-0, NESM3, NorESM2-LM, and UKESM1-0-LL. We consider 17 models from the CMIP5 ensemble: bcc-csm1-1, CanESM2, CCSM4, CESM1-CAM5, CNRM-CM5, CSIRO-Mk3-6-0, GFDL-CM3, GISS-E2-H, GISS-E2-R, HadGEM2-ES, IPSL-CM5A-LR, MIROC-ESM, MIROC-ESM-CHEM, MPI-ESM-LR, MPI-ESM-MR, MRI-CGCM3, and NorESM1-M.

TCRs and ECS of CMIP6 models were taken from (*15*) and recalculated following the same methodology for ACCESS-CM2 and AWI-CM-1-1-MR. TCR and ECS of CMIP5 models come from table 9.5 in (*39*).

### Statistical method

*Constraining projections*. We first focus on the analysis of GSAT time series over the period 1850–2100, i.e., in response to historical forcings followed by a given emission scenario. We denote the real-world response to all forcings **x**. is a vector of size *n* = 251: the GSAT time series over the period 1850–2100 and under a given emission scenario. We denote the vector (time series) of GSAT observations **y**—these are available from 1850 to 2019 (i.e., **y** is of length *n _{y}* = 170).

The statistical approach works in two steps. First, we consider an ensemble of climate models to derive a (model-based) prior for **x**, denoted π(**x**). Second, we make use of observations **y** to derive the posterior *p*(**x**∣**y**).

To derive the prior π(**x**), we start by estimating the forced GSAT response for each CMIP model (using all available members) and each scenario considered. This is performed using a generalized additive model (GAM), in which the response to NAT forcing is a rescaling of the response of an energy balance model, while the response to ANT forcings is only assumed to be smooth in time (smoothing splines are used to filter out part of internal variability). The procedure is identical to that described in (*40*); details are given in the Supplementary Materials. Then, we consider an ensemble of these forced responses (e.g., all CMIP6 or CMIP5 models) and derive the prior π(**x**) ∼ *N*(**μ**, **Σ**_{mod}), assuming model-truth exchangeability (*20*, *41*, *42*), i.e., calculating **μ** and **Σ**_{mod} as the sample mean and variance-covariance matrix over models. In this framework, **Σ**_{mod} quantifies the model uncertainty on the forced response **x**. This prior π(**x**) can be interpreted as a first estimate of **x**, which makes no use of observations, and is based only on climate model simulations, with uncertainty based on model spread (*8*).

We further assume that observations **y** can be written as the sum of the forced response plus internal variability and measurement error, i.e.**H** is an observation operator and ε is a random term corresponding to measurement errors and internal variability. **H** basically extracts the part of **x** that is observed in **y**, i.e., the total forced response from 1850 to 2019. In the current application, **H** is a truncated *n _{y}* ×

*n*identity matrix, with 1 in the

*n*diagonal coefficients and 0 elsewhere.

_{y}Under this assumption, the posterior distribution *p*(**x**∣**y**) can be derived easily, as both π(**x**) and ε follow normal laws. Details about this technical step are given in the Supplementary Materials.

*Application to attribution*. We now describe extension of our statistical method to the attribution problem. To estimate attributable warming, the method is applied to a much longer vector **x**: **x** = (**x**^{all}; **x**^{nat}; **x**^{ghg}), where **x**^{all}, **x**^{nat}, and **x**^{ghg} describe the responses to ALL, NAT-only, or GHG-only forcings, respectively. **x**^{all} and **x**^{nat} both cover the period 1850–2100, but **x**^{ghg} is restricted to the period 1850–2020 [as we only consider `histGHG` simulations (*38*)]. Therefore, **x** has a total dimension of *n _{a}* = 251 + 251 + 171 = 673 in this case.

**H**has to be adapted accordingly and becomes a

*n*×

_{y}*n*matrix

_{a}**I**

*is a truncated identity matrix of size*

_{ny}*n*×

_{y}*n*. As observations have been made in the presence of all forcings,

**H**simply extracts values of

**x**

^{all}at observed dates—and the other components of

**x**are not “observed.” The remaining of the method is unchanged. The posterior distribution

*p*(

**x**∣

**y**) then provides information not only about the total forced response given

**y**but also about the responses to NAT and GHG forcings specifically, still given

**y**. Estimates of the responses to other combinations of forcings such as ANT or OA can be derived by subtraction. For instance, the prior and posterior distributions of

**x**

^{ant}≔

**x**

^{all}−

**x**

^{nat}can be easily deduced from those of the full vector

**x**= (

**x**

^{all};

**x**

^{nat};

**x**

^{ghg}). The proposed technique is very similar, although not strictly identical, to that introduced in (

*20*) (further discussion in the Supplementary Materials).

Deriving the prior π(**x**), in the case of attribution, requires an estimate of the responses of each climate model considered to the NAT and GHG forcings, specifically. The NAT component is taken from the GAM decomposition mentioned above. Estimation of the GHG component requires a specific analysis. For some CMIP6 models, histGHG simulations are available and cover the full 1850–2020 period; in this case, **x**^{ghg} is estimated from these simulations, using a basic smoothing procedure. For other models, the histGHG simulation do not cover the entire 1850–2020 period, and we use an adaptation of our Kriging technique to produce a statistical completion of that simulation over 1850–2020. Last, some models did not perform any histGHG simulation, and then we use another adaptation of our Kriging technique to reconstruct them statistically, using 1% CO_{2} and an extended historical run as inputs. Details regarding estimation of **x**^{ghg} are given in the Supplementary Materials. Ignoring reconstructed **x**^{ghg} responses in the attribution analysis leads to slightly narrower confidence ranges (fig. S8). This suggests that the main analysis shown in Fig. 1 is conservative.

*Application to TCR and ECS*. Another extension of our statistical technique concerns metrics of climate sensitivity, i.e., TCR and ECS. A key difficulty, in this case, is that there are several ways to relate TCR and ECS to the constrained temperature changes. The method used to produce Fig. 2 involves applying the observational constraint to a vector*F* is the radiative forcing corresponding to a doubling of CO_{2} concentration, λ is the feedback parameter, and ECS*43*). Calculating ECS that way means that the feedback parameter λ is assumed constant; therefore, we consider an effective climate sensitivity rather than the real long-term ECS (*44*). As for other applications, this vector **x** is assumed to follow a Gaussian prior. This formulation ensures that λ is negative (a desirable property). However, there are other valid options to relate ECS to changes in **x**^{all} (see discussion in the Supplementary Materials). Among the three options tested, constrained ranges for ECS agree reasonably well (fig. S10).

*Discussion of the method and estimation of input parameters*. Applying the observational constraint this way offers many advantages. The model prior captures various facets of model uncertainty in a consistent way. This includes uncertainty in both the magnitude and the temporal pattern of the forced response, unlike usual D&A methods (*6*, *7*, *10*, *19*, *20*, *27*). Using the entire observed dataset avoids the arbitrary choice of a reference period (as performed, e.g., for historical trends), facilitates annual updates, and helps distinguish the responses to GHGs from OA factors (thanks to the different emission time series for these two forcings). In addition, the entire time series of constrained projections is derived in a consistent way, which is particularly attractive for near-term projections.

Our procedure can be considered as an adaptation of Kriging or Gaussian process regression (*45*, *46*)—a method developed to interpolate geophysical data based on prior covariances. Here, this technique is used to extrapolate past changes into the future, using climate models as a learning sample. One noticeable difference with a standard use of Kriging resides in the prior, which, in our case, is nonstationary and derived from physically based climate models. In this statistical procedure, one could also identify all components of a Kalman filtering procedure, as used in data assimilation and weather forecasting systems: a model or background error covariance **Σ**_{mod}, observation error covariance **Σ**_{obs}, and the observation operator **H**. However, unlike in data assimilation, there is no iteration—our procedure could be considered as a one-step filtering.

Overall, application of this technique requires to specify what **x** and **y** are and how the key components **μ**, **Σ**_{mod}, and **Σ**_{obs} are estimated. In practice, **μ** and **Σ**_{mod} are derived as the sample mean and variance-covariance matrix of an ensemble of climate model responses. Given the limited number of available models, **Σ**_{mod} is highly degenerate. **Σ**_{obs} is a covariance matrix describing observational uncertainty. We assume that **Σ**_{obs} = **Σ**_{meas.} + **Σ**_{i. v.}, where **Σ**_{meas.} is the measurement error, while **Σ**_{i. v.} describes internal variability. **Σ**_{meas.} is simply estimated as the sample variance-covariance matrix of the HadCRUT4-CW ensemble. Estimation of **Σ**_{i. v.} is based on a mixture of two autoregressive processes of order 1 (AR1), with different time scales. We assume that internal variability ε_{t} writes_{slow} > α_{fast}. To estimate the coefficients (**y**) and then apply a maximum likelihood procedure to these residuals. Uncertainty on these estimated coefficients is not taken into account. Our estimate of Σ_{i.v.} is consistent with previous studies reporting that GSAT internal variability exhibits more dependence than AR1 but less than a long-range memory effect (*47*, *48*). Further discussion and justification for this choice are provided in the Supplementary Materials.

## SUPPLEMENTARY MATERIALS

Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/7/4/eabc0671/DC1

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is **not** for commercial advantage and provided the original work is properly cited.

## REFERENCES AND NOTES

**Acknowledgments:**We acknowledge insightful comments and constructive criticisms from three reviewers, which highly improved the quality of this manuscript. We are very grateful to M. Stolpe and K. Tokarska for sharing estimates of climate sensitivity in CMIP6 models and discussions related to their recent work. We thank O. Geoffroy and D. Saint-Martin for insightful discussions on climate sensitivity and emerging constraints. We thank the climate modeling groups involved in CMIP5 and CMIP6 exercises for producing and making available their simulations.

**Funding:**A.R. and S.Q. acknowledge support by Météo-France, CNRS, the European Union’s Horizon 2020 Research and Innovation Program under the EUCP (grant agreement 776613) and the CONSTRAIN (grant agreement 820829) projects; the EUPHEME project (which is part of ERA4CS, an ERA-NET initiated by JPI Climate and cofunded by the European Union, grant agreement 690462); and the French Ministère de la Transition Écologique et Solidaire, under the Convention pour les Services Climatiques. N.P.G. is supported by Environment and Climate Change Canada.

**Author contributions:**A.R. designed the study and produced most figures. S.Q. processed the data and contributed to several analyses and figures. All authors contributed to interpreting the results, designing new sensitivity analyses, and writing of the manuscript.

**Competing interests:**The authors declare that they have no competing interests.

**Data and material availability:**The HadCRUT4-CW dataset is available at www-users.york.ac.uk/~kdc3/papers/coverage2013/series.html. The CMIP datasets are available at https://esgf-node.llnl.gov/projects/esgf-llnl/. Effective radiative forcing (ERF) time series corresponding to the CMIP6 exercise (

*50*) are available at https://doi.org/10.5281/zenodo.3515338. ERF time series corresponding to the CMIP5 exercise are available at www.pik-potsdam.de/~mmalte/rcps/. The figures and analyses were produced with the R software (available at https://cran.r-project.org). R code used to perform this analysis is available from the corresponding author upon request. A simplified version producing constrained ranges for one SSP scenario is available on the corresponding author’s personal webpage. All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.

- Copyright © 2021 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC).