Research Article | Research Methods

Neyman-Pearson classification algorithms and NP receiver operating characteristics


Science Advances  02 Feb 2018:
Vol. 4, no. 2, eaao1659
DOI: 10.1126/sciadv.aao1659
  • Fig. 1 Classical versus NP oracle classifiers in a binary classification example.

    In this toy example, the two classes have equal marginal probabilities, that is, P(Y = 0) = P(Y = 1) = 1/2. Suppose that a user prefers a type I error of ≤0.05. The classical classifier I(X > 1), which minimizes the risk, results in a type I error of 0.16. In contrast, the NP classifier I(X > 1.65), which minimizes the type II error under the type I error constraint (α = 0.05), delivers the desired type I error control. This figure is adapted from Tong et al. (17) and Li and Tong (52), with permission.
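The numbers in this caption can be checked with a short calculation. This is a minimal sketch assuming, consistently with the reported type I error of 0.16 for the threshold 1 and the 0.05-level threshold of 1.65, that the class 0 observations follow a standard normal distribution (the function `phi` below is an illustrative helper, not part of the paper's code):

```python
import math

def phi(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Classical classifier I(X > 1): type I error = P(X > 1 | Y = 0) ~ 0.16,
# which exceeds the user's desired bound of 0.05.
type1_classical = 1.0 - phi(1.0)

# NP classifier I(X > 1.65): type I error = P(X > 1.65 | Y = 0) <= 0.05,
# since 1.65 is (approximately) the 95% quantile of N(0, 1).
type1_np = 1.0 - phi(1.65)
```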

  • Fig. 2 Choose a threshold such that the type I error is below α with high probability.

    (A) The naïve approach. (B) The fivefold CV–based approach. (C) The NP approach. The classifiers built from D = 1000 simulated data sets are shown as overlapping red and teal “x” symbols (A and B, respectively) and blue “+” symbols (C) on the curves.

  • Fig. 3 Illustration of choosing NP classifiers from empirical ROC curves and NP-ROC lower curves in Simulation S1 (see the Supplementary Materials).

    (A) Distributions of empirical type I errors and population type I errors of 1000 classifiers, with each classifier chosen from one empirical ROC curve corresponding to the largest empirical type I error no greater than 0.05. (B) Distributions of empirical type I errors and population type I errors of 1000 classifiers, with each classifier chosen from one NP-ROC lower curve (δ = 0.05) corresponding to α = 0.05.

  • Fig. 4 Illustration of NP-ROC bands.

    (A) How to draw an NP-ROC band. Each blue dashed line represents one NP classifier, with horizontal coordinate α and vertical coordinates 1 − βU (lower) and 1 − βL (upper). Right-continuous and left-continuous step functions are used to interpolate points on the upper and lower ends, respectively. (B) Use of NP-ROC bands to compare the two LDA methods in Simulation 2.

  • Fig. 5 NP-ROC bands and ROC-CV curves (generated by fivefold CV) of three classification methods in real data application 1.

    (A) NP-ROC bands of RFs versus penLR versus SVMs. RF dominates the other two methods for a wide range of α values. (B) ROC-CV curves of RF, penLR, and SVM. Among the three methods, RF has the largest area under the curve. (C) NP-ROC band and ROC-CV curve of RF. The black dashed vertical line marks α = 0.21, the smallest α such that the conditional type II error is no greater than 0.4. The green dashed vertical line marks α = 0.249, the value that maximizes the vertical distance from the lower curve to the diagonal line, a criterion motivated by the Youden index (39). (D) NP-ROC band and ROC-CV curve of SVM.
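The Youden-motivated criterion in (C), choosing the α that maximizes the vertical distance from the NP-ROC lower curve to the diagonal, amounts to a one-line search over a grid of α values. The lower curve used below is a made-up toy function for illustration only, not the paper's RF band:

```python
# Hypothetical NP-ROC lower curve sampled on a grid of alpha values;
# in practice this curve comes from the NP umbrella algorithm.
alphas = [i / 100 for i in range(101)]
lower_curve = [0.9 * a ** 0.5 for a in alphas]  # toy 1 - beta_U values

# Youden-style choice: alpha maximizing the vertical distance from the
# lower curve to the diagonal line, i.e., lower_curve(alpha) - alpha.
best = max(range(len(alphas)), key=lambda i: lower_curve[i] - alphas[i])
alpha_star = alphas[best]
```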

  • Fig. 6 NP-ROC bands of three classification methods in real data application 2.

    (A) penLR versus RFs. No method dominates the other for any α values. (B) penLR versus NB. The black bar at the bottom indicates the α values where penLR is better than NB with high probability.

  • Algorithm 1. An NP umbrella algorithm
    1: Input:
      Training data: a mixed i.i.d. sample S = S0 ∪ S1, where S0 and S1 are the class 0 and class 1 samples, respectively
      α: type I error upper bound, 0 ≤ α ≤ 1 (default α = 0.05)
      δ: a small tolerance level, 0 < δ < 1 (default δ = 0.05)
      M: number of random splits on S0 (default M = 1)
    2: Function RankThreshold(n, α, δ)
    3: For k in {1, ⋯, n} do ◃ For each rank threshold candidate k
    4: v(k) ← Σ_{j=k}^{n} C(n, j) (1 − α)^j α^{n−j} ◃ Calculate the violation rate upper bound
    5: k* ← min{k ∈ {1, ⋯, n} : v(k) ≤ δ} ◃ Pick the rank threshold
    6: Return k*
    7: Procedure NPClassifier(S, α, δ, M)
    8: n ← ⌊|S0|/2⌋ ◃ Denote half of the size of S0 by n
    9: k* ← RankThreshold(n, α, δ) ◃ Find the rank threshold
    10: For i in {1, ⋯, M} do ◃ Randomly split S0 M times
    11: S0_{i,1}, S0_{i,2} ← random split on S0 ◃ Each time, randomly split S0 into two halves of equal size
    12: S_i ← S0_{i,1} ∪ S1 ◃ Combine S0_{i,1} and S1
    13: S0_{i,2} = {x_{i,1}, ⋯, x_{i,n}} ◃ Write S0_{i,2} as a set of n data points
    14: f_i ← ClassificationAlgorithm(S_i) ◃ Train a scoring function f_i on S_i
    15: T_i ← {f_i(x_{i,1}), ⋯, f_i(x_{i,n})} ◃ Apply the scoring function f_i to S0_{i,2} to obtain a set of score threshold candidates
    16: {t_{i,(1)}, ⋯, t_{i,(n)}} ← sort(T_i) ◃ Sort the elements of T_i in increasing order
    17: t*_i ← t_{i,(k*)} ◃ Find the score threshold corresponding to the rank threshold k*
    18: φ̂_i(x) = I(f_i(x) > t*_i) ◃ Construct an NP classifier based on the scoring function f_i and the threshold t*_i
    19: Output: the ensemble NP classifier φ̂(x) = I((1/M) Σ_{i=1}^{M} φ̂_i(x) ≥ 1/2) ◃ By majority vote
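Algorithm 1 can be sketched in a few dozen lines. This is a minimal illustration, not the paper's reference implementation (which is the R package nproc): `train` stands in for any user-chosen ClassificationAlgorithm, and the violation-rate bound v(k) is the binomial tail sum from step 4:

```python
import math
import random

def rank_threshold(n, alpha=0.05, delta=0.05):
    """Smallest rank k whose violation-rate upper bound
    v(k) = sum_{j=k}^{n} C(n, j) (1 - alpha)^j alpha^(n - j)
    is at most delta; None if no rank qualifies (class 0 sample too small)."""
    for k in range(1, n + 1):
        v = sum(math.comb(n, j) * (1 - alpha) ** j * alpha ** (n - j)
                for j in range(k, n + 1))
        if v <= delta:
            return k
    return None

def np_classifier(s0, s1, train, alpha=0.05, delta=0.05, M=1, seed=0):
    """Ensemble NP classifier. `train(data0, data1)` is any routine that
    returns a scoring function f (higher score -> more likely class 1)."""
    rng = random.Random(seed)
    n = len(s0) // 2
    k_star = rank_threshold(n, alpha, delta)
    if k_star is None:
        raise ValueError("class 0 sample too small for this alpha and delta")
    voters = []
    for _ in range(M):
        idx = list(range(len(s0)))
        rng.shuffle(idx)
        half1 = [s0[i] for i in idx[:n]]       # used with s1 to train f_i
        half2 = [s0[i] for i in idx[n:2 * n]]  # left out for threshold choice
        f = train(half1, s1)
        thresholds = sorted(f(x) for x in half2)
        t_star = thresholds[k_star - 1]        # k*-th order statistic
        voters.append((f, t_star))
    def classify(x):
        votes = sum(1 for f, t in voters if f(x) > t)
        return 1 if votes >= M / 2 else 0      # majority vote
    return classify
```

Note that the smallest usable class 0 sample is dictated by v(n) = (1 − α)^n ≤ δ; with the defaults α = δ = 0.05, this requires n ≥ 59, that is, at least 118 class 0 observations.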

Supplementary Materials

  • Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/4/2/eaao1659/DC1

    Proof of Proposition 1

    Conditional type II error bounds in NP-ROC bands

    Empirical ROC curves versus NP-ROC bands in guiding users to choose classifiers to satisfy type I error control

    Effects of majority voting on the type I and II errors of the ensemble classifier

    table S1. Results of LR in Simulation S2.

    table S2. Results of SVMs in Simulation S2.

    table S3. Results of RFs in Simulation S2.

    table S4. Results of NB in Simulation S2.

    table S5. Results of LDA in Simulation S2.

    table S6. Results of AdaBoost in Simulation S2.

    table S7. Description of variables used in real data application 1.

    table S8. The performance of the NP umbrella algorithm in real data application 2.

    table S9. Input information of the nproc package (version 2.0.9).

    R codes and data sets
