Research Article
ENVIRONMENTAL STUDIES

Extensive arsenic contamination in high-pH unconfined aquifers in the Indus Valley

Science Advances  23 Aug 2017:
Vol. 3, no. 8, e1700935
DOI: 10.1126/sciadv.1700935
  • Fig. 1 Arsenic concentrations measured in Pakistan groundwater.

    Arsenic exceeds the WHO guideline of 10 μg/liter in large parts of the Indus plain. The green to brown coloring illustrates the topography. The Indus River and its major tributaries as well as the major cities are indicated. The samples were collected for this study (n = 1184) between 2013 and 2015.

  • Fig. 2 Statistics of the classification strength of the logistic regression analysis results using the threshold of 10 μg/liter (WHO As guideline) applied to the entire set of 743 aggregated data points.

    (A) ROC curve with an AUC of 0.80, which indicates the discriminative power of the logistic equation. (B) Sensitivity (true-positive rate), accuracy, and specificity (true-negative rate) versus cutoff value.
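
The quantities plotted in Fig. 2 can be illustrated with a small sketch. This is not the authors' code: the function names and the toy probabilities and labels are hypothetical. The AUC is computed from its pairwise-ranking definition (the chance that a randomly chosen exceedance receives a higher modeled probability than a randomly chosen non-exceedance), which equals the area under the ROC curve.

```python
def confusion_rates(probs, labels, cutoff):
    """Return (sensitivity, specificity, accuracy) at one probability cutoff.

    probs  : modeled probabilities of exceeding the threshold (hypothetical)
    labels : 1 if the measured value exceeded the threshold, else 0
    """
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        predicted_high = p >= cutoff
        if y and predicted_high:
            tp += 1
        elif y and not predicted_high:
            fn += 1
        elif not y and predicted_high:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)          # true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    accuracy = (tp + tn) / len(labels)
    return sensitivity, specificity, accuracy


def auc(probs, labels):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    pos = [p for p, y in zip(probs, labels) if y]
    neg = [p for p, y in zip(probs, labels) if not y]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy data only; these are not values from the study.
toy_probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]
rates_at_half = confusion_rates(toy_probs, toy_labels, cutoff=0.5)
toy_auc = auc(toy_probs, toy_labels)
```

Sweeping `cutoff` from 0 to 1 and recording the three rates at each step gives curves of the kind shown in panel (B).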

  • Fig. 3 Arsenic prediction and risk models.

    (A) Probability (hazard) map of the occurrence of arsenic concentrations in groundwater exceeding the WHO As guideline of 10 μg/liter along with the aggregated arsenic data points used in modeling (n = 743) (see fig. S3B for the hazard map using 50 μg/liter). (B) Density of population at risk of high levels of arsenic in groundwater using the WHO As guideline of 10 μg/liter. The map is based on 2016 population figures and a 60 to 70% groundwater utilization rate (see text). The estimated number of people potentially affected is ~50 million to 60 million, with hot spots around Lahore and Hyderabad.
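
The arithmetic behind a population-at-risk estimate of this kind can be sketched as follows. The caption defers the actual procedure to the main text, so the rule used here (count the groundwater-using share of the population in every grid cell whose modeled probability meets a cutoff) is an assumption, and the function name, cutoff, and cell values are hypothetical.

```python
def population_at_risk(cells, utilization_rate, cutoff):
    """Sum the groundwater-using population over high-hazard cells.

    cells            : iterable of (hazard_probability, population) pairs
    utilization_rate : fraction of people relying on groundwater (0 to 1)
    cutoff           : probability at or above which a cell counts as high hazard
                       (an assumed rule, not taken from the caption)
    """
    return sum(pop * utilization_rate
               for prob, pop in cells
               if prob >= cutoff)


# Toy grid only; these are not values from the study.
toy_cells = [(0.9, 1000), (0.7, 2000), (0.2, 5000)]

# Bracket the estimate with the 60% and 70% utilization rates from the caption.
low_estimate = population_at_risk(toy_cells, 0.60, cutoff=0.5)
high_estimate = population_at_risk(toy_cells, 0.70, cutoff=0.5)
```

Evaluating the sum at both ends of the utilization range is one way a single hazard map could yield a range such as the ~50 million to 60 million quoted above.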

  • Fig. 4 Indicators of geochemical environment.

    (A) Measured iron concentrations (n = 458). (B) Soil organic carbon (70) used as a predictor variable. (C) Correlation of groundwater samples exceeding the WHO As guideline of 10 μg/liter against soil pH (in 11 bins; see Materials and Methods). (D) Soil pH (70) used as a predictor variable, shown with the arsenic measurements above and below 10 μg/liter.

  • Table 1 Comparison of AUC and AIC results of logistic regression analyses of models with fixed variables (models A to E) along with the final model (model F) shown in Fig. 3A, which was achieved by stepwise variable selection.

    A higher AUC shows better model prediction performance, whereas a lower AIC is indicative of a simpler, more effective model. The associated hazard maps are provided in fig. S4.

    Model  Predictor variable(s)                          AUC   AIC
    A      Fluvisols (probability)                        0.69  787 ± 5
    B      Irrigated area                                 0.72  739 ± 7
    C      Aridity, slope (binary, 0.1°)                  0.73  716 ± 8
    D      Slope (binary, 0.1°), soil pH                  0.77  688 ± 9
    E      Aridity, Holocene fluvial sediments (binary),  0.77  664 ± 8
           slope (binary, 0.1°)
    F      Fluvisols (probability), Holocene fluvial      0.80  644 ± 9
           sediments (binary), soil organic carbon,
           soil pH, slope (binary, 0.1°)

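
As background to the AIC column of Table 1, the sketch below applies the standard definition AIC = 2k − 2 ln L, where k is the number of fitted parameters and L is the likelihood of the observed exceedances under the model. It is not the authors' code: the function names and inputs are hypothetical, and the sketch does not reproduce how the ± ranges in the table were obtained, which this caption does not state.

```python
import math


def log_likelihood(probs, labels):
    """Bernoulli log-likelihood of binary outcomes under modeled probabilities."""
    return sum(math.log(p) if y else math.log(1.0 - p)
               for p, y in zip(probs, labels))


def aic(probs, labels, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood(probs, labels)


# Toy comparison only; these are not values from the study.
toy_labels = [1, 0, 1, 0]
uninformative = aic([0.5, 0.5, 0.5, 0.5], toy_labels, n_params=1)
informative = aic([0.8, 0.2, 0.8, 0.2], toy_labels, n_params=2)
```

Because each extra predictor adds 2 to the score, a larger model only achieves a lower AIC if it improves the fit enough to offset that penalty.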
  • Table 2 Data sets evaluated for use as predictor variables in the logistic regression analysis.

    Correlations were found using the percentage of measurements exceeding 10 μg/liter in 11 bins of equal member-size across the range of each variable (see Materials and Methods). P values of logistic regression are based on univariate analyses. An asterisk indicates data sets that were not significant and therefore removed from the full logistic regression analysis. n/a, not applicable.

    Data set                                     Resolution  Correlation (P)        Logistic regression (P)
    Potential evapotranspiration (PET) (72, 73)  30″         0.730 (<0.05)          <0.05
    Precipitation (74)                           30″         −0.776 (<0.05)         <0.05
    Aridity [precipitation (74)/PET (72, 73)]    30″         −0.779 (<0.05)         <0.05
    Irrigated area % (75)                        5′          0.967 (<0.05)          <0.05
    Slope (binary, 0.1°) (76)                    30″         n/a                    <0.05
    Fluvisol probability (%) (70, 77, 78)        30″         0.704 (<0.05)          <0.05
    Soil organic carbon (70, 77, 78)             30″         −0.778 (<0.05)         <0.05
    Soil pH (70, 77, 78)                         30″         0.977 (<0.05)          <0.05
    Soil clay % (70)*                            30″         −0.338 (>0.05)         >0.05
    Soil silt % (70)*                            30″         −7.22 × 10^−2 (>0.05)  >0.05
    Holocene fluvial sediments (binary) (69)     Polygon     n/a                    <0.05

    *Data sets that were not significant and therefore removed from the full logistic regression analysis.
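
The binning described in the Table 2 caption can be sketched as follows: sort the samples by the predictor, split them into 11 bins with equal numbers of members, and compute the percentage of each bin exceeding 10 μg/liter. This is not the authors' code. The function names and toy data are hypothetical, and the use of a Pearson coefficient between bin means and exceedance percentages is an assumption, since the caption does not name the correlation measure.

```python
import math


def equal_count_bins(pairs, n_bins):
    """Sort (predictor, arsenic) pairs by predictor and split into n_bins
    groups of near-equal size. Assumes len(pairs) >= n_bins."""
    ordered = sorted(pairs)
    n = len(ordered)
    return [ordered[i * n // n_bins:(i + 1) * n // n_bins]
            for i in range(n_bins)]


def exceedance_by_bin(pairs, n_bins=11, threshold=10.0):
    """Return (mean predictor value, % of samples above threshold) per bin."""
    result = []
    for members in equal_count_bins(pairs, n_bins):
        mean_x = sum(x for x, _ in members) / len(members)
        pct = 100.0 * sum(1 for _, a in members if a > threshold) / len(members)
        result.append((mean_x, pct))
    return result


def pearson(points):
    """Pearson correlation coefficient of (x, y) points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    cov = sum((x - mx) * (y - my) for x, y in points)
    sx = math.sqrt(sum((x - mx) ** 2 for x, _ in points))
    sy = math.sqrt(sum((y - my) ** 2 for _, y in points))
    return cov / (sx * sy)


# Toy data only: 22 samples whose arsenic value equals the predictor value.
toy_pairs = [(float(i), float(i)) for i in range(22)]
toy_bins = exceedance_by_bin(toy_pairs)
toy_r = pearson(toy_bins)
```

Working with bin percentages rather than individual samples turns the binary exceedance outcome into a continuous quantity that can be correlated against the predictor.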

    Supplementary Materials

    • Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/3/8/e1700935/DC1

      fig. S1. Maps of spatial distribution and values of all chemical parameters.

      fig. S2. Grids of all measured chemical parameters.

      fig. S3. Hazard maps of logistic regression models using thresholds of 10 and 50 μg/liter.

      fig. S4. Maps of other well-fitting logistic regression models.

      fig. S5. Statistics of logistic regression model using threshold of 50 μg/liter.

      fig. S6. Density of population at risk using logistic regression models with thresholds of 10 and 50 μg/liter.

      fig. S7. Predictor data sets used in the final model.

      fig. S8. Graphs of correlation between predictor variables and percentage of As measurements >10 μg/liter.

      table S1. Summary statistics of all measured parameters.

      table S2. Coefficients, SDs, and frequencies of predictor variables in logistic regressions with a threshold of 10 μg/liter.

      table S3. Coefficients, SDs, and frequencies of predictor variables in logistic regressions with a threshold of 50 μg/liter.