## ABSTRACT

Genetic studies of several marine species with high fecundity have produced “tiny” estimates (≤10^{−3}) of the ratio of effective population size (*N*_{e}) to adult census size (*N*), suggesting that even very large populations might be at genetic risk. A recent study using close-kin mark-recapture methods estimated adult abundance at *N* ≈ 2 × 10^{6} for southern bluefin tuna (SBT), a highly fecund top predator that supports a lucrative (~$1 billion/year) fishery. We used the same genetic and life history data (almost 13,000 fish collected over 5 years) to generate genetic and demographic estimates of *N*_{e} per generation and *N*_{b} (effective number of breeders) per year and the *N*_{e}/*N* ratio. Demographic estimates, which accounted for age-specific vital rates, skip breeding, variation in fecundity at age, and persistent individual differences in reproductive success, suggest that *N*_{e}/*N* is >0.1 and perhaps about 0.5. The genetic estimates supported this conclusion. Simulations using true *N*_{e} = 5 × 10^{5} (*N*_{e}/*N* = 0.25) produced results statistically consistent with the empirical genetic estimates, whereas simulations using *N*_{e} = 2 × 10^{4} (*N*_{e}/*N* = 0.01) did not. Our results show that robust estimates of *N*_{e} and *N*_{e}/*N* can be obtained for large populations, provided sufficiently large numbers of individuals and genetic markers are used and temporal replication (here, 5 years of adult and juvenile samples) is sufficient to provide a distribution of estimates. The high estimated *N*_{e}/*N* ratio in SBT is encouraging and suggests that the species will not be compromised by a lack of genetic diversity in responding to environmental change and harvest.

## INTRODUCTION

Effective population size (*N*_{e}) is the evolutionary analog of census size (*N*). Whereas census size strongly influences rates of demographic/ecological processes like competition, predation, and population growth rates, *N*_{e} controls the rates of inbreeding, genetic drift, and loss of genetic diversity and influences effectiveness of natural selection (*1*). Because effective size is challenging to estimate in natural populations, considerable interest has focused on the ratio of effective size to census size, *N*_{e}/*N* (*2*, *3*). If specific *N*_{e}/*N* values consistently apply to certain classes of species, then information about abundance could be used to predict rates of evolutionary processes at the population level. However, although considerable theoretical and empirical efforts have been made to find relationships between *N*_{e}/*N* and a variety of life history features (*4*–*7*), a completely general framework remains elusive.

Two factors in particular make this challenging. First, a number of studies have found a negative correlation between *N*_{e}/*N* and *N*. Possible explanations for this pattern include reduced variance in reproductive success at low abundance (*8*, *9*), increased adult mortality that reduces *N* more than *N*_{e} (*10*), and density-dependent effects that increase offspring survival and generation length at low density (*11*). Second, and more controversial, is the hypothesis of sweepstakes reproductive success (SRS) in marine species (*12*, *13*), which postulates that *N*_{e}/*N* could be tiny (≤10^{−3}) in species with high fecundity, if only a few families “win the sweepstakes” by producing offspring that survive to reproduce. Numerous estimates of *N*_{e}/*N* in the range 10^{−3} to 10^{−6} have been published for high-fecundity marine species (*13*, *14*). If the *N*_{e}/*N* ratio is this small, it is important to know because even very large populations (*N* ≥ 10^{6}) could be at genetic risk. Some authors (*15*) argue that these concerns potentially apply quite generally to marine species with high fecundity.

Here, we evaluate the possibility of a tiny *N*_{e}/*N* ratio in southern bluefin tuna (SBT; *Thunnus maccoyii*), a large, mobile, top marine predator with broadcast spawning and high batch fecundity (>6 × 10^{6}), life history traits that are often associated with SRS. SBT support an important international fishery, the value of which has been estimated at >$1 billion/year; the species is also generally considered highly depleted, but traditional methods (such as conventional mark-recapture) have not been able to provide reliable estimates of stock status or recovery probability (*16*). For these reasons (high value + high uncertainty), SBT were targeted for the first large-scale application of close-kin mark-recapture (CKMR) methods to estimate abundance (*16*). This project sampled almost 13,000 juveniles and adults over 5 years (Table 1) and produced an estimate of adult population size (*N* ≈ 2 × 10^{6}) that was both higher and more precise [coefficient of variation (CV) = 0.17] than previous methods allowed. With this estimate of census size as a reference point, we used the extensive array of CKMR samples, together with life history information for SBT, to generate both genetic and demographic estimates of effective size, as well as the *N*_{e}/*N* ratio. We also simulated genetic data for two effective sizes corresponding to *N*_{e}/*N* ratios of 0.25 and 0.01 to compare with the empirical genetic estimates. Results of our demographic and genetic analyses are congruent and reject the possibility of tiny *N*_{e}/*N* in SBT. Instead, our results suggest that *N*_{e}/*N* is >0.1 and perhaps about 0.5.

## RESULTS

### Population demography

The demographic model for SBT uses discrete time periods in years, indexed by *x*. Reproduction occurs in Indonesia and follows the seasonal birth-pulse model (*17*). At age *x*, each individual produces an average of *b*_{x} offspring and then survives to age *x* + 1 with probability *s*_{x}. Available data were sufficient only for a single estimate of constant adult survival for both sexes [*s*_{x} = 0.77/year; (*16*)]. Age at maturity (α) was fixed at age 8, maximum age was set to 30, and population size was *N* = 1.97 × 10^{6} adults [(*16*); see Materials and Methods]. Age-specific vital rates for SBT used in the analyses below are shown in table S1.

Compared to some other marine ectotherms, male SBT have a less pronounced increase in fecundity with age, while females show a strongly sigmoidal pattern owing to increases in per-kilo reproductive success of older females (Fig. 1). The combination of delayed maturity and increasing fecundity with age leads to a relatively long generation time (*T* = mean age of parents = 13.3 years). Because of the steeper increase in fecundity with age, generation length is longer for females (14.7 years) than for males (11.9 years).
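As a rough illustration of how generation length follows from the vital rates, the sketch below computes *T* as the mean age of parents, weighting each age by survivorship and relative fecundity. The function name and the flat fecundity schedule are assumptions made here for illustration; the actual age-specific values are in table S1 and are not reproduced.

```python
def generation_length(ages, survivorship, fecundity):
    """Mean age of parents: sum(x * l_x * b_x) / sum(l_x * b_x)."""
    weights = [l * b for l, b in zip(survivorship, fecundity)]
    return sum(x * w for x, w in zip(ages, weights)) / sum(weights)

# Adult ages 8..30 with constant annual survival of 0.77 (values from the text).
ages = list(range(8, 31))
survivorship = [0.77 ** (x - 8) for x in ages]

# Hypothetical schedule: fecundity does not change with age.
flat_fecundity = [1.0] * len(ages)

t_flat = generation_length(ages, survivorship, flat_fecundity)
print(round(t_flat, 2))
```

With flat fecundity the result is a little above 11 years, shorter than the 13.3 years reported for SBT. That is the direction one would expect if fecundity that increases with age shifts parentage toward older fish.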

### Demographic estimates of effective size

For iteroparous species like SBT, it is important to distinguish two kinds of effective size: *N*_{e} is the effective population size per generation, and *N*_{b} is the effective number of breeders per year or breeding cycle. *N*_{e} determines the rates of evolutionary processes (such as genetic drift and loss of genetic variability) in the population as a whole, but *N*_{b} is often easier to estimate and provides important insights into the breeding system and reproductive dynamics.

We used the program AgeNe (*18*) to produce demographic estimates of *N*_{e} and *N*_{b} based on SBT vital rates. In addition to age-specific estimates of survival and fecundity, calculation of *N*_{e} requires a third vital rate: ϕ_{x} = *V*_{k(x)}/*b*_{x}, which is the ratio of the variance to the mean reproductive success in one time period for individuals of age *x*. Random variation in reproductive success among same-age, same-sex individuals produces ϕ_{x} ≈ 1. We calculated *N*_{e} per generation using Hill’s (*19*) formula

$$N_e \approx \frac{4 N_1 T}{V_{k\bullet} + 2} \tag{1}$$

where *N*_{1} is the number of offspring produced each time period that reach age at recruitment, and *V*_{k•} is lifetime variance in reproductive success (production of offspring) among the *N*_{1} individuals in a cohort. For SBT, *N*_{1} = 4.54 × 10^{5} recruits/year produces an adult population size of 1.97 × 10^{6}, which is similar to the estimate from the CKMR study (*16*). *N*_{b} is calculated using the standard discrete-generation formula for inbreeding effective size (*20*)

$$N_b = \frac{\bar{k} N - 2}{\bar{k} - 1 + V_k/\bar{k}} \tag{2}$$

where *N* is adult census size and *V*_{k} and $\bar{k}$ (the variance and mean of offspring number per adult) are computed across both sexes on the basis of offspring produced in a single reproductive cycle. In contrast to the age-specific values of ϕ_{x} = *V*_{k(x)}/*b*_{x} described above, *V*_{k} and $\bar{k}$ in Eq. 2 apply to individuals of all ages that reproduce in one time period.
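The two effective-size formulas can be sketched in a few lines. This assumes the usual forms of Hill's formula, *N*_{e} ≈ 4*N*_{1}*T*/(*V*_{k•} + 2), and of the discrete-generation inbreeding formula, *N*_{b} = (k̄*N* − 2)/(k̄ − 1 + *V*_{k}/k̄). The function names are hypothetical, and this is not the AgeNe implementation, which derives *V*_{k•} from the full table of age-specific vital rates.

```python
def hill_ne(n1, gen_length, vk_lifetime):
    """Ne per generation from cohort size, generation length, lifetime variance."""
    return 4.0 * n1 * gen_length / (vk_lifetime + 2.0)

def nb_inbreeding(n_adults, k_mean, k_var):
    """Inbreeding Nb from mean and variance of offspring number in one season."""
    return (k_mean * n_adults - 2.0) / (k_mean - 1.0 + k_var / k_mean)

def implied_lifetime_variance(ne, n1, gen_length):
    """Invert Hill's formula to recover the lifetime variance implied by Ne."""
    return 4.0 * n1 * gen_length / ne - 2.0

# Sanity check: with mean = variance = 2 offspring per adult, Nb is N - 1.
print(nb_inbreeding(1000, 2.0, 2.0))

# Back-calculation from rounded values reported in the text
# (N1 = 4.54e5, T = 13.3, Ne = 1.86e6); approximate only.
vk_implied = implied_lifetime_variance(1.86e6, 4.54e5, 13.3)
print(round(vk_implied, 1))
```

Because the inputs are rounded, the back-calculated lifetime variance (about 11) should be read as an order-of-magnitude check rather than a reported result.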

Empirical data for ϕ_{x} are rarely available for natural populations, but for SBT, we were able to use empirical data on variance of fecundity at age (table S2) to estimate ϕ_{x}. This analysis considers a variation of the original Wright-Fisher model in which individuals contribute unequally to a very large gamete pool from which the next cohort of offspring is randomly drawn. Our analyses showed that, assuming random survival of fertilized eggs until recruitment, the observed variance in fecundity at age would have only a trivial effect (ϕ_{x} < 1.02 for all ages in both males and females; table S2). These should be considered minimal estimates of ϕ_{x} because they only account for effects of body size and not other factors (for example, behavior and physiology) that can affect reproductive output.
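One way to see why variation in fecundity at age has so little effect is a simple Poisson approximation to the gamete-pool model. This is an assumption made here for illustration and is not necessarily the calculation used in Supplementary Methods. If each individual's expected number of surviving offspring is proportional to its fecundity, and the realized number is Poisson around that expectation, the law of total variance gives ϕ = 1 + *b*·CV², where *b* is the mean number of offspring per individual in one time period and CV is the coefficient of variation of fecundity among same-age individuals. The input values below are hypothetical.

```python
def phi_gamete_pool(mean_offspring, cv_fecundity):
    """Variance-to-mean ratio of offspring number under a Poisson gamete pool.

    Var(k) = E[Var(k | fecundity)] + Var(E[k | fecundity])
           = b + (b * CV)**2, so phi = Var(k) / b = 1 + b * CV**2.
    """
    return 1.0 + mean_offspring * cv_fecundity ** 2

# Hypothetical inputs: 0.1 surviving offspring per adult per year, CV = 0.3.
phi_base = phi_gamete_pool(0.1, 0.3)
# Same mean, but CV of fecundity five times as large.
phi_5x = phi_gamete_pool(0.1, 1.5)

print(round(phi_base, 3), round(phi_5x, 3))
```

Under this approximation the excess of ϕ over 1 scales with the square of the CV, so a fivefold increase in CV multiplies the excess by 25. Because *b* counts only offspring that survive to recruitment, it is small, which keeps ϕ close to 1 unless fecundity varies enormously.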

A useful point of reference for evaluating the *N*_{e}/*N* ratio is an “ideal” Wright-Fisher population with discrete generations, in which *N*_{e} = *N*_{b} = *N* (ideal scenario in Fig. 2). The other scenarios in this figure all use the SBT vital rates in table S1, with modifications or additions as noted on the figure, and scale population size to produce a constant *N* = 1.97 × 10^{6} adults age 8 and older. Using the empirical SBT vital rates, including the minimum estimates of ϕ_{x}, initial demographic estimates of effective size are 1.34 × 10^{6} for *N*_{b} and 1.86 × 10^{6} for *N*_{e}, with the latter producing a high estimated *N*_{e}/*N* ratio of 0.93 (Fig. 2). This estimate, which is not much below the 1.0 ratio for an ideal population, reflects combined effects of three major life history traits.

First, discrete-generation populations reproduce in a single year or season, whereas SBT have adult life spans of up to 23 years. In the “α = 1, fixed *b*_{x}” scenario in Fig. 2, the population matures at age 1 and fecundity is invariant with age (both as in the discrete-generation model), but annual adult survival, adult life span, and ϕ_{x} use the SBT values from table S1. Under these conditions, *N*_{e}/*N* drops to 0.56. This decline reflects the fact that, in iteroparous species, by chance some individuals live longer than others and reproduce more times; this increases lifetime variance in reproductive success (*V*_{k•}) and reduces *N*_{e}/*N* compared to the discrete-generation model, where all individuals die at the same age. Results for the α = 1, fixed *b*_{x} scenario are consistent with previous work (*4*, *6*), which has shown that, with constant vital rates, α = 1, and all ϕ = 1, *N*_{e}/*N* converges on 0.5 as adult life span increases. Alternative scenarios that used higher or lower estimates of adult survival did not appreciably affect the results (Supplementary Methods; table S3).

The second major factor that reduces *N*_{e}/*N* in SBT is age-related changes in fecundity. When fecundity increases with age, as it does in both sexes in SBT (Fig. 1), individuals that live longer not only get to reproduce more often, their output also goes up each year. This further exacerbates disparities in lifetime reproductive success and reduces *N*_{e}/*N*. In the “α = 1” scenario, adding the SBT vital rates for fecundity further dropped *N*_{e}/*N* to 0.44, an additional decline of 22%.

Compensating for these two negative factors is a third factor that substantially increases *N*_{e}/*N*: SBT do not mature until age 8. Delayed maturity increases generation length, and because *T* appears in the numerator of Eq. 1, *N*_{e} also increases proportionally. The only difference between the SBT and the α = 1 scenarios is the delayed age at maturity in SBT, which more than doubles *N*_{e}/*N* (from 0.44 to 0.93). Compared to these major life history effects, the very slight increase in ϕ based on empirical fecundity at age data reduces *N*_{e} only trivially (0.1%) from what it would be under random reproductive success (compare results for “ϕ = 1” and SBT scenarios in Fig. 2).

Because the minimum estimates of ϕ do not capture all factors that can influence variation in reproductive success—including family-correlated survivals that could be important for highly fecund species—we also estimated *N*_{e} and *N*_{b} assuming higher fixed values of ϕ ranging from 10 to 10^{4}. *N*_{e}/*N* drops to 0.39 for ϕ = 10 and to 0.11 for ϕ = 50 (Fig. 2), and subsequently drops by an order of magnitude for each order of magnitude increase in ϕ (fig. S1). Only for the extreme scenario with ϕ = 10^{4} does the estimated effective size/census size ratio drop to the range of tiny estimates that have appeared in the literature [*N*_{e}/*N* ≤ 10^{−3}; (*14*, *21*)]. ϕ = 10^{4} requires that only 1 in about 10^{3} to 10^{4} adults successfully reproduces each year (*21*). *N*_{b} is more sensitive than *N*_{e} to high values of ϕ; whereas increasing ϕ from near 1 to 10 drops *N*_{e}/*N* by 58%, it causes a decline of 93% in *N*_{b}/*N*, to 0.05.

Other important factors that can influence both *N*_{e} and *N*_{b} are skip breeding and persistent individual differences in reproductive success. Assuming that only half of SBT aged 8 to 12 are available to spawn in a given year (see Materials and Methods), annual *N*_{b} would be reduced by about 31% to 9.0 × 10^{5} (skip scenario; Fig. 2 and table S3). This reduction is fairly large because young fish make up a substantial fraction of the adult population. Skip breeding has a proportionally smaller effect on *N*_{b} if ϕ is assumed to be higher (the reduction is only about 4% for ϕ = 10; table S3). Conversely, *N*_{e} per generation increases with skip breeding because it helps ensure that different parents get to reproduce in different years. Because only part of the adult population is affected, we estimated the increase in *N*_{e} at only 4%, which is at the lower end of the range reported for species with high fecundity and high juvenile mortality (*22*).

If some individuals are consistently (across multiple time periods) good or bad at producing offspring, this consistency has no effect on *N*_{b} per year but reduces *N*_{e} per generation, because it increases variation among individuals in lifetime reproductive success. When we modeled persistent differences in reproductive success by assuming they scaled directly with empirical estimates of the CV of fecundity at age, we found a negligible (1%) effect on *N*_{e} (compare SBT and persist scenarios in Fig. 2). However, if the CV of fecundity at age were assumed to be five times as large as the empirical estimate (leading to a modest increase in ϕ to 1.1 to 1.4), *N*_{e} would decline by 18% (“persist, ϕ > 1.1” scenario). This shows that fairly small increases in ϕ, if they are persistent over time, can substantially influence effective population size.

### Genetic estimates of effective size

The 25 microsatellite loci used here were developed specifically for SBT to determine parentage among a set of adults and juveniles sampled from the wild fishery (*16*, *23*). The loci were chosen to be highly polymorphic and easily scored. The genetic estimates of effective size reflect quantitative adjustments to account for both physical linkage and effects of age structure. Results reported here include sensitivity analyses of effects of removing two loci with estimated frequencies of null alleles >10% and use of two criteria for screening out rare alleles.

Estimates of yearly *N*_{b}. The 10 adjusted point estimates of *N*_{b} from yearly juvenile samples were all >10^{4}, and 3 were infinitely large [indicating that all of the observed linkage disequilibrium (LD) could be explained by sampling error; Figs. 3 and 4]. Lower bounds for all 95% confidence intervals (CIs) were also >10^{4}. The combined (across years) estimates were 7.9 × 10^{4} to 1.2 × 10^{5}, which produce estimates of the *N*_{b}/*N* ratio of 0.04 to 0.06 (table S3).

Estimates of generational *N*_{e}. Adjusted point estimates of *N*_{e} for 2006 and 2007 were between 10^{4} and 10^{5}, while those for the last 3 years (2008–2010) were all infinite (Fig. 3). The 2006 sample included only 212 adults (Table 1; more than 1100 were sampled in each of the other years), and the 2006 adult *N*_{e} estimates were also the only ones for which the lower bound of the 95% CIs dipped below 10^{4}. Combined estimates computed across all 5 years of adult samples were infinitely large, with a lower 95% confidence bound of ~2 × 10^{4}. The annual point estimates of *N*_{e} shown in Fig. 3 translate into *N*_{e}/*N* values that were either indeterminate (when estimated *N*_{e} = ∞) or fell in the range 0.01 to 0.1.

Simulated data. The two simulated scenarios used amounts of data comparable to those available for SBT, with effective sizes of *N*_{e} = 5 × 10^{5} (close to the demographic estimate assuming ϕ = 10) and *N*_{e} = 2 × 10^{4} (which would produce an *N*_{e}/*N* ratio of 0.01, a reduction of 99% from the adult census size). The distribution of simulated estimates for *N*_{e} = 5 × 10^{5} closely resembled, and was not statistically different from (*P* > 0.1), the empirical estimates of effective size in SBT: No point estimates were below 10^{4}, most finite estimates fell between 10^{4} and 10^{6}, and a substantial fraction of estimates were infinitely large (Fig. 4). In contrast, when the smaller *N*_{e} was simulated, 100% of the point estimates fell between 11,000 and 67,000, and this tight distribution was statistically incompatible with empirical data for both *N*_{b} and *N*_{e} (*P* < 10^{−6}). Thus, if true *N*_{e} were as small as 2 × 10^{4}, we would not expect to see any appreciable fraction of infinite estimates, which is at odds with the pattern actually observed for SBT.

Statistical theory provides an additional perspective for evaluating the empirical and simulated estimates of effective size. Expected precision of the LD method can be evaluated using the following equation (*24*)

$$\mathrm{CV}(\hat{N}_e) \approx \sqrt{\frac{2}{n}}\left[1 + \frac{3N_e}{S}\right]$$

where *n* is the number of pairwise comparisons of alleles and *S* is the number of individuals sampled. Substituting for *n* = 350(349)/2 = 61,075 and *S* = 1500 to approximate the amount of data available for SBT (see Materials and Methods) produces

$$\mathrm{CV}(\hat{N}_e) \approx 0.0057\left[1 + \frac{N_e}{500}\right]$$

For the two simulation scenarios, expected CVs for $\hat{N}_e$ are 0.24 for *N*_{e} = 2 × 10^{4} and 5.7 for *N*_{e} = 5 × 10^{5}. This result shows that relatively precise estimates of effective size can be made even in relatively large populations, provided a sufficient number of genetic markers are available and the ratio *N*_{e}/*S* is not too large; we confirmed this in the simulations using *N*_{e} = 2 × 10^{4}. However, this also illustrates the difficulty in obtaining precise estimates of effective size when true *N*_{e} is orders of magnitude larger than the sample size, as is the case for the larger simulated effective size. In that case, replication across multiple samples (for SBT, five samples each were available for juveniles and adults) can be very valuable to allow comparison of distributions rather than individual point estimates.
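The expected precision for the two simulated scenarios can be checked numerically. The sketch below assumes the approximation CV ≈ √(2/*n*)·(1 + 3*N*_{e}/*S*), which reproduces the values quoted in the text; the function name is hypothetical.

```python
import math

def expected_cv_ld(n_allele_pairs, sample_size, ne):
    """Approximate CV of an LD-based estimate of effective size."""
    return math.sqrt(2.0 / n_allele_pairs) * (1.0 + 3.0 * ne / sample_size)

# 350 alleles give 350 * 349 / 2 pairwise comparisons; S = 1500 individuals.
n_pairs = 350 * 349 // 2

cv_small = expected_cv_ld(n_pairs, 1500, 2e4)
cv_large = expected_cv_ld(n_pairs, 1500, 5e5)
print(n_pairs, round(cv_small, 3), round(cv_large, 2))
```

The two values come out close to the 0.24 and 5.7 given above. The contrast shows why a single point estimate is informative when *N*_{e} is about 13 times the sample size but not when it is several hundred times larger.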

## DISCUSSION

The CKMR study for SBT (*16*) obtained a robust estimate of adult abundance that was both higher and much more precise than had been possible with other methods. Here, we have used the same genetic data from the same juvenile and adult samples, together with the life history information for SBT that informed the CKMR estimates of abundance, to generate both genetic and demographic estimates of effective population size (*N*_{e} per generation and *N*_{b} per year). In addition, directly comparing the new data with the CKMR results has allowed us to produce robust estimates of the effective size/census size ratio. Our results illustrate the synergistic benefits of jointly considering demographic and genetic data to study evolutionary processes in large populations. The two approaches have different statistical properties that complement each other to produce robust estimates. The demographic approaches start with an upper limit of *N*_{e}/*N* ≈ 1 and consider various factors that could lower the ratio. However, demographic data generally provide only indirect information about the key age-specific parameter ϕ. If ϕ is very high, *N*_{e}/*N* could be very low, or even in the tiny range (≤10^{−3}). Conversely, genetic methods struggle to distinguish very large *N*_{e} from infinitely large *N*_{e}, but they have much more power to set lower bounds for the range of *N*_{e} that is consistent with the data. Together, the demographic and genetic estimates constrain (bracket) plausible values of *N*_{e}/*N* in SBT to a much smaller range than would be possible using either method by itself.

The demographic analyses accounted for several factors that are rarely considered in estimating effective size in species with overlapping generations: skip breeding, persistent differences in reproductive success, and empirical data for ϕ. We estimate that skip breeding in SBT substantially reduces *N*_{b} but only slightly increases *N*_{e}, and thus does not have an appreciable effect on the *N*_{e}/*N* ratio. Our analyses show that substantial and persistent individual differences in reproductive success can reduce *N*_{e} and *N*_{e}/*N*. However, differences in fecundity at age in SBT based on empirical data are not large enough to appreciably reduce *N*_{e}, even if they persist throughout an individual’s adult life span.

The largest uncertainty regarding the *N*_{e}/*N* ratio in SBT is the value of ϕ. Minimum values of ϕ computed from observed variation in fecundity at age are barely above 1.0 and have little effect on *N*_{e}. Using these minimal estimates of ϕ and other life history parameters reported for SBT, we estimate that the upper bound for the effective size/census size ratio for SBT is relatively high (0.5 < *N*_{e}/*N* < 1). An important factor contributing to this result is delayed age at maturity, which increases generation length and hence *N*_{e} (refer to Eq. 1).

We consider the empirical ϕ estimates minimal because they do not account for other factors that might contribute to individual differences in reproductive success. Lacking any empirical data regarding these other potential factors, we evaluated consequences of assuming a range of possible fixed values of ϕ. We show that *N*_{e}/*N* for SBT could reach the tiny range (≤10^{−3}) only if ϕ is very large (~10^{4}; fig. S1); this in turn would require a rather extreme form of SRS (*21*). The genetic data, which provide independent estimates of *N*_{b}/*N* and *N*_{e}/*N*, are particularly informative regarding uncertainty in ϕ; they also provide insights into some of the less well-studied aspects of SBT biology. If ϕ were even as large as 50 in SBT, *N*_{b} would be reduced to ~1.8 × 10^{4} (Fig. 2 and table S3). This is below the lower 95% confidence bounds for the combined (across years) genetic estimates of *N*_{b} (Fig. 3). Furthermore, 1.8 × 10^{4} is comparable to the smaller of the simulated effective sizes, and when we simulated populations with an effective size of 2 × 10^{4}, the resulting distribution of estimates did not match the empirical genetic estimates of effective size (Fig. 4).

We cannot rule out values of ϕ in the range of 10 or so, which (after accounting for skip breeding) would produce *N*_{e} ≈ 8 × 10^{5} and *N*_{e}/*N* ≈ 0.4. These estimates would be consistent with empirically and theoretically derived *N*_{e}/*N* ratios for a large number of species (*2*, *3*). The combined demographic and genetic results thus suggest that ϕ in SBT cannot be large enough to produce tiny *N*_{e}/*N* ratios comparable to those reported in the literature for some other highly fecund marine species. Similarly, if persistent differences in reproductive success were a substantial factor for SBT, the consequences should be reflected in the genetic estimates, but we do not find such evidence. We therefore conclude that the *N*_{e}/*N* ratio in SBT is above 0.1 and perhaps about 0.5.

It is important to emphasize that our results for SBT do not rule out SRS and tiny *N*_{e}/*N* ratios as potentially important factors for other species. However, it is noteworthy that two of the life history features commonly associated with SRS (broadcast spawning and very high fecundity) are also characteristics of SBT. Thus, our results demonstrate that SRS and very low *N*_{e}/*N* ratios are not an inevitable consequence of these life history traits.

### Caveats

We collected genetic data used in this study from five consecutive years of juvenile and adult samples, which nevertheless represent only a fraction of one generation for SBT. We cannot rule out the possibility that extreme examples of SRS occur occasionally in SBT. However, if these events were common, they should be detectable as a signal of elevated LD, which decays only gradually over a period of several generations, and we did not find evidence for high LD.

The LD method is subject to potential bias due to both age structure and physical linkage of markers. Detailed life history information for SBT allowed us to make a quantitative adjustment to single-cohort estimates of *N*_{b} (see Materials and Methods for details) (*25*). Effects of mixed-age samples on estimates of *N*_{e} are not as well understood, so that adjustment should be considered only approximate. The bias adjustment for physical linkage reflects the expected fraction of pairs of loci that are on the same chromosome or linkage group, which in turn depends on the number of chromosomes (*26*). As the number of loci becomes large, the actual fraction of linked pairs of loci should converge on the expected fraction, and the adjustment should become more precise. In studies like this one with at most a few dozen loci, the actual fraction of linked loci might deviate from the expected fraction, making the bias adjustment less reliable. However, we found no evidence for strong physical linkage in SBT (figs. S3 and S4). Overall, the combined effect of the two bias adjustments we implemented had little net effect (+4%) on genetic estimates of *N*_{b}. Combined adjustments were larger for *N*_{e} because both potential biases are in the same direction. However, 6 of the 10 genetic *N*_{e} estimates were already infinitely large before any adjustments; thus, overall, these adjustments had no appreciable effect on order-of-magnitude conclusions about effective size/census size ratios in SBT.

The AgeNe model (*18*) used for demographic estimates of effective size assumes stable age structure and constant population size, which are unlikely to be strictly true in any natural population. The CKMR study estimated a nonsignificant decline in adult SBT abundance over the period 2002–2010 (*16*). Previous work (*18*, *25*) has shown that AgeNe results are robust to random demographic variation, and a similar model (*27*) accurately estimates *N*_{e} if population size changes at a constant rate.

It is well known [for example, (*24*)] that because genetic methods for estimating contemporary effective size depend on signals that are proportional to 1/*N*_{e}, these methods have difficulty distinguishing very large and infinite effective sizes, and we see some evidence of that in results for SBT. The distribution of genetic estimates of *N*_{b} and *N*_{e} was bimodal, with about half being infinitely large and the rest falling in the range 10^{4} to 10^{6}. This pattern is typical for scenarios where true *N*_{e} is very large and the amount of data (numbers of individuals and numbers of genetic markers) is insufficient to produce high precision for individual point estimates (*21*). Nevertheless, in this study, the combination of (i) large samples of individuals (average >1000 per year for both juveniles and adults), (ii) a relatively large number of highly variable genetic markers, and (iii) replication across five consecutive years produced a distribution of estimates that effectively constrains the range of plausible values for effective population size, particularly on the lower end.

### Conservation implications

The high estimated *N*_{e}/*N* ratio in SBT is generally good news from a conservation standpoint. The species is considered overfished and Critically Endangered by the International Union for Conservation of Nature (*28*). If *N*_{e}/*N* for SBT were in the tiny range, the species could be at some genetic risk, even with estimated abundance of >10^{6} adults. Our results indicate that *N*_{e} is also close to 10^{6}, which indicates that effective size is not likely to be a limiting factor for recovery. Nevertheless, because the amount of genetic diversity a population can maintain (and its ability to respond to rapid environmental change) is strongly influenced by *N*_{e}, careful attention to both *N*_{e}/*N* ratios and the absolute effective size is important.

Our results also emphasize the reality that large samples of individuals are essential if one wants robust estimates of effective size in large populations. Although empirical estimates of *N*_{b} and *N*_{e} for our large samples were bimodal, the lowest point estimates, and almost all lower bounds of CIs, were greater than 10^{4} (Figs. 3 and 4). In contrast, if true *N*_{e} were 10^{6} but only typical sample sizes of 50 to 100 individuals had been used, the expected result is that most of the finite estimates would have fallen in the low hundreds to low thousands (*21*). The large samples used in the SBT study thus constrained the “blind spot” in genetic estimates of large *N*_{e} to a relatively small range. Unfortunately, use of samples of 1000 individuals or more to estimate effective size in populations that are known or suspected to be large is not common [see (*29*) for an exception]. Those interested in obtaining robust empirical estimates of *N*_{e}/*N* in large populations should incorporate large samples of individuals from multiple time periods into the experimental design. In addition, using a combination of life history and genetic data to estimate effective size provides more robust estimates than can be obtained from either method by itself.

The SBT CKMR study began in 2006, and the battery of microsatellite loci used was sufficient to reliably identify parent-offspring pairs. In the past decade, orders-of-magnitude increases have been made in speed and cost-effectiveness of DNA sequencing, and it is now easy to assay many thousands of single-nucleotide polymorphism (SNP) markers, even for non-model species. This technological breakthrough now makes it feasible to expand the CKMR methodology to include reliable identification of siblings (*30*). In addition to providing increased precision for estimating abundance, this also opens up the possibility of adding the sibship method for estimating effective size (*31*) to the genetic toolbox for estimating *N*_{e} in large populations.

## MATERIALS AND METHODS

### Experimental design

Juveniles were sampled in association with a commercial harvest near Port Lincoln, South Australia, where immature fish are captured and raised in net pens before marketing. This fishery captures 2- to 4-year-olds that are distinguishable by size, and we used individual lengths to restrict our collections each year to single cohorts of age 3 fish. Adults were sampled from landings in a long-line fishery in Indonesia, where mixed-age adults are harvested. The current analyses used the same 5 years of samples used for the CKMR estimates of abundance (Table 1; juvenile, *S* = 1316 to 1625 per year; total = 7251; adult, *S* = 212 to 1466 per year; total = 5626). Each juvenile and adult sample was used to provide independent genetic estimates of effective size, and weighted harmonic means (with weights proportional to effective degrees of freedom) were used to obtain overall (across years) estimates for the two types of samples.
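A weighted harmonic mean is a natural way to combine yearly estimates because infinite estimates contribute zero to the sum of reciprocals instead of swamping the average. The sketch below is a minimal illustration; the function name, the example estimates, and the equal weights are hypothetical (the study weighted by effective degrees of freedom).

```python
import math

def weighted_harmonic_mean(estimates, weights):
    """Combine effective-size estimates; math.inf entries add 0 to the sum."""
    total_weight = sum(weights)
    # w / inf evaluates to 0.0, so infinite estimates only add to total_weight.
    inverse_sum = sum(w / e for e, w in zip(estimates, weights))
    if inverse_sum <= 0.0:
        return math.inf  # every estimate was infinite
    return total_weight / inverse_sum

# Hypothetical yearly estimates, two of them infinite, with equal weights.
yearly = [5e4, math.inf, 2e5, math.inf, 1e5]
combined = weighted_harmonic_mean(yearly, [1.0] * len(yearly))
print(f"{combined:.3g}")
```

The combined value stays finite as long as at least one yearly estimate is finite, and it becomes infinite only when all of them are.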

### Population demography

Extensive geographic sampling has not produced any evidence of stock structure within SBT (*32*, *33*). We therefore assumed that SBT comprise a single, isolated, randomly mating population. The CKMR study (*16*) reported abundance estimates in terms of biomass of age 10+ fish, whereas for our purposes, we want the number of adults age 8 and older. For the years 2002–2010, the mean estimated number of age 8 SBT recruits was 4.54 × 10^{5} (*23*). Assuming an adult survival rate of 0.77 per year and a maximum longevity of 30 years (table S1), constant recruitment at this rate would produce a population of *N* = 1.97 × 10^{6} adults; hence, this is the value we used for the denominator in the *N*_{e}/*N* ratio.
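The census-size arithmetic in the preceding paragraph can be reproduced with a short Python sketch. It assumes, as stated in the text, constant recruitment of 4.54 × 10^{5} fish at age 8, a constant adult survival of 0.77 per year, and a maximum age of 30; the variable names are ours.

```python
recruits_at_age_8 = 4.54e5   # mean estimated age-8 recruits, 2002-2010
adult_survival = 0.77        # annual adult survival rate
age_at_maturity = 8
max_age = 30

# Under constant recruitment, the number alive at each adult age is the
# recruit number discounted by survival for each year since age 8.
numbers_at_age = {
    age: recruits_at_age_8 * adult_survival ** (age - age_at_maturity)
    for age in range(age_at_maturity, max_age + 1)
}

adult_census_size = sum(numbers_at_age.values())
print(f"N = {adult_census_size:.3g}")
```

Summing the 23 adult age classes gives a value that rounds to the *N* = 1.97 × 10^{6} used as the denominator of the *N*_{e}/*N* ratio.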

Calculation of relative age-specific fecundity (*b*_{x}) involved two steps. First, on average, older fish are larger, and larger fish are more fecund, leading to higher fecundity with age in both sexes (Fig. 1). Second, larger females produce more eggs per kilo of body weight. There is no evidence that the same phenomenon occurs in males. Accordingly, relative fecundity for males was assumed to be proportional to weight at age, while relative fecundity for females was proportional to the product of weight at age and per-kilo production of eggs. Because juvenile survival is poorly known for SBT, fecundity was expressed in terms of production of offspring that survive to age at maturity (α = 8 years), and *b*_{x} was scaled to values required to produce a population of constant adult census size (*N*), given the modeled survival rates. Ignoring juvenile mortality affects both effective size and census size but not their ratio, which is of primary interest here. Empirical data for ϕ_{x} are rarely available, and that was also the case here. Empirical data on variance of fecundity at age (table S2) provided a means of calculating a minimum bound for ϕ_{x}, as described in detail in Supplementary Methods.

The model underlying AgeNe (*18*) assumes that reproduction and survival are independent across time periods (reproduction does not affect subsequent survival or reproduction). SBT violate this in at least two ways. First, in analyzing the number of years between parent-offspring pair matches in SBT, Bravington et al. (*23*) found evidence for every-other-year spawning for young (age 8 to 12) SBT of both sexes, but not for older fish; presumably, this occurs because energetic costs of reproduction are high. Consequences of skip breeding for *N*_{e} and *N*_{b} were evaluated using the model developed by Waples and Antao (*22*). On the basis of the skip-breeding data (*23*), we reduced the number of available spawners each year by 50% for ages 8 to 12.
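As a rough illustration of the skip-breeding adjustment (not the Waples and Antao model itself), the following Python sketch halves the number of available spawners at ages 8 to 12 and reports the fraction of adults available to spawn in a given year. It assumes a stable age distribution under the constant adult survival of 0.77 given under Population demography; the function name and its parameters are ours.

```python
adult_survival = 0.77
adult_ages = range(8, 31)  # ages 8 through 30

# Relative numbers at age under constant recruitment and survival.
relative_numbers = {age: adult_survival ** (age - 8) for age in adult_ages}


def available_fraction(numbers_at_age, skip_ages=range(8, 13), skip_rate=0.5):
    """Fraction of adults available to spawn when a proportion skip_rate
    of individuals in skip_ages sits out the season."""
    total = sum(numbers_at_age.values())
    available = sum(
        n * (1.0 - skip_rate) if age in skip_ages else n
        for age, n in numbers_at_age.items()
    )
    return available / total


spawning_fraction = available_fraction(relative_numbers)
print(f"fraction of adults available to spawn: {spawning_fraction:.2f}")
```

Because young adults dominate the age distribution, a 50% reduction confined to ages 8 to 12 removes a substantial share of potential spawners in any one year.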

Second, individuals that are unusually large (or small) at an early age are also likely to have positive (or negative) residuals for size at later ages, which would lead to positive correlations in realized fecundity over time. To evaluate consequences of persistent individual differences for SBT, we modeled lifetime reproductive success for the *N*_{1} individuals born into the same cohort. Individuals in the cohort survived randomly each year with probability *s*_{x} and on reaching age *x* produced an average of *b*_{x} offspring, as specified in table S1. As a point of reference, random reproductive success in each time period among individuals in the cohort was modeled by a Poisson process with ϕ_{x} = 1 (equivalent to assuming that all individuals have the same expected fecundity). If larger fish of the same age have higher expected fecundity, ϕ_{x} will be >1, leading to overdispersed variance in reproductive success. From table S2, the median coefficient of variation of age-specific fecundity related to variation in fecundity at age is CV(*b*_{x}) = 0.1. This provides a measure of overdispersion, which we used to calculate ϕ_{x} for use in AgeNe as described above and in Supplementary Methods. However, the AgeNe model assumes that these variations in expected fecundity are random and uncorrelated across years. To model persistent individual differences, for each individual in a newborn cohort, we selected a random normal deviate (*z*) and found its associated probability (rnorm and pnorm functions in R). This characterized the relative expected reproductive success for that individual at every age throughout its life. For example, for a female SBT with *z* = 1.3, pnorm(1.3) = 0.9, indicating that this individual’s expected reproductive success would be in the 90th percentile every year it survived to reproduce. At age 20, SBT females on average produce *b*_{20} = 1.595 offspring that survive to recruitment (table S1). 
Assuming a CV(*b*_{x}) = 0.1 implies SD(*b*_{20}) = 0.16, and the 90th percentile of the normal distribution with mean = 1.595 and SD = 0.16 is 1.8. Thus, our consistently “above average” fish had expected reproductive success at age 20 of 1.8 offspring that survive to age at recruitment, about 13% above the mean. This value of 1.8 was then used as the parameter for a random Poisson draw to determine how many offspring that individual actually produced in that time period. We kept track of lifetime reproductive output for all members of the cohort and used that information to calculate *V*_{kx} and *N*_{e} using Eq. 1.
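The worked example above (a fish with *z* = 1.3 at age 20) can be sketched in Python as follows. The original analysis used rnorm and pnorm in R; here `statistics.NormalDist` plays the same role, and the Poisson draw uses Knuth's multiplication method because the standard library has no Poisson sampler. Function names and the random seed are ours.

```python
import math
import random
from statistics import NormalDist


def poisson_draw(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def expected_fecundity(z, mean_bx, cv=0.1):
    """Map an individual's lifetime z-score to its expected fecundity
    at an age whose mean fecundity is mean_bx."""
    percentile = NormalDist().cdf(z)  # analog of pnorm(z) in R
    # Same percentile of a normal distribution with SD = cv * mean.
    return NormalDist(mean_bx, cv * mean_bx).inv_cdf(percentile)


rng = random.Random(1)   # arbitrary seed for repeatability
z = 1.3                  # persistent deviate drawn once per individual
b_20 = 1.595             # mean offspring at age 20 (table S1)

expected_at_20 = expected_fecundity(z, b_20)
offspring_at_20 = poisson_draw(expected_at_20, rng)
print(f"expected = {expected_at_20:.2f}, realized = {offspring_at_20}")
```

In a full cohort simulation, the same *z* would be reused at every age the individual survives to, and the realized offspring numbers would be summed over its lifetime.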

### Genetic data

The genetic analyses used 5 years of samples of juveniles and adults scored for up to 27 microsatellite loci [see (*23*) for detailed information about the genetic methodology]. In the CKMR study (*16*), several additional loci were added after the first phase of sampling to boost power to detect parent-offspring pairs. Only ambiguous juveniles sampled in the early phase were subsequently scored for the additional loci. In addition, in all samples, loci scored as missing were re-run only for individuals for which the additional information might have affected parent-offspring status. As a consequence, the distribution of missing data was somewhat uneven across the data set. After dropping two loci not scored in a substantial fraction of individuals, and then dropping individuals missing data for more than 10 loci (fig. S2), we had a data set with 25 loci and 12,877 individuals (5626 adults and 7251 juveniles; Table 1 and table S4).

### Genetic estimates of effective size

We used the single-sample LD method to estimate effective size. The LD method is based on random associations of alleles at different gene loci; these occur in all finite populations and can be quantified by the squared correlation coefficient *r*^{2}, which is inversely related to *N*_{e} (*34*). We calculated mean *r*^{2} and estimated effective size using the program LDNe (*35*), as implemented in NeEstimator V2.1 (*36*). The juvenile samples are from single cohorts and can be used to estimate *N*_{b} (*37*), while the mixed-age adult samples can be used to estimate *N*_{e} (*25*). The LD method estimates effective size in the parents of the individuals sampled (*37*), so these effective size estimates can be compared directly with estimates of adult census size obtained through CKMR.
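To illustrate the inverse relationship between *r*^{2} and *N*_{e}, the Python sketch below uses Hill's classical approximation for unlinked loci, E(*r*^{2}) ≈ 1/(3*N*_{e}) + 1/*S*. This is a simplification chosen for illustration; LDNe applies empirically derived bias corrections that are not reproduced here, and the function name is ours.

```python
def ne_from_r2(mean_r2, sample_size):
    """Approximate Ne from mean r^2 using E(r^2) ~ 1/(3Ne) + 1/S.

    Returns infinity when r^2 is no larger than expected from
    sampling alone (no detectable drift signal).
    """
    drift_r2 = mean_r2 - 1.0 / sample_size
    if drift_r2 <= 0:
        return float("inf")
    return 1.0 / (3.0 * drift_r2)


sample_size = 1500
true_ne = 5e5

drift_component = 1.0 / (3.0 * true_ne)
sampling_component = 1.0 / sample_size
expected_r2 = drift_component + sampling_component

print(f"drift signal    : {drift_component:.2e}")
print(f"sampling noise  : {sampling_component:.2e}")
print(f"recovered Ne    : {ne_from_r2(expected_r2, sample_size):.3g}")
```

Under this approximation, the drift signal for a population with *N*_{e} in the hundreds of thousands is far smaller than the sampling term even with *S* = 1500, which is why large samples and many locus pairs are needed.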

LD estimates were computed using two different criteria for screening out rare alleles: *P*_{Crit} = 0.02 (all alleles with frequencies <0.02 are omitted) and *P*_{Crit} = 0.01. *P*_{Crit} = 0.02 was recommended as a generally good way to balance effects on precision and bias (*24*), but we also considered the more lenient *P*_{Crit} = 0.01 because of the large sample sizes (more than 1000 individuals/year, except for 1 year of adult sampling). The two *P*_{Crit} values provided a total of 10 point estimates of *N*_{e} and *N*_{b}, along with combined estimates computed as weighted harmonic means across all 5 years. Two of the 25 loci (D569 and D573) had relatively high (>10%) estimated frequencies of null alleles (*16*), so in an additional sensitivity analysis, we repeated the effective size estimates after dropping them. As estimates using 23 versus 25 loci did not differ appreciably (Fig. 3), we focused on the 25 locus analyses in Results. Also, because estimates using *P*_{Crit} = 0.01 used the most data, we considered them the most robust.
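The effect of the two *P*_{Crit} thresholds can be sketched in Python for a single locus. The genotypes below are hypothetical, the function name is ours, and the rule shown (drop alleles with sample frequency below the threshold) follows the description in the text; the exact screening rules in LDNe may differ in detail.

```python
from collections import Counter


def screen_alleles(genotypes, p_crit):
    """Return {allele: frequency} for alleles at one locus whose
    sample frequency is at least p_crit."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items() if c / total >= p_crit}


# Hypothetical diploid genotypes for 100 individuals at one locus:
# allele C has 3 copies out of 200 (frequency 0.015).
genotypes = [("A", "A")] * 52 + [("A", "B")] * 45 + [("A", "C")] * 3

kept_lenient = screen_alleles(genotypes, p_crit=0.01)
kept_strict = screen_alleles(genotypes, p_crit=0.02)
print(sorted(kept_lenient), sorted(kept_strict))
```

In this toy example the rare allele is retained under the lenient criterion and excluded under the stricter one, so the lenient criterion contributes more pairwise allele comparisons.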

We adjusted the raw genetic effective size estimates in two ways, to account for physical linkage and for age structure. If some pairs of loci are physically linked, LD and *r*^{2} will be higher than expected from drift alone, which will downwardly bias estimated *N*_{e} unless an adjustment is made. Physical linkage can be accounted for if the recombination rate for each pair of loci is known (*34*, *38*) or if all loci can be mapped to their chromosomes, but detailed linkage information is not available for SBT. Instead, we applied a generic correction factor based on the natural logarithm of the haploid number of chromosomes (Chr), which determines the fraction of locus pairs expected to be in each linkage category (*26*)

$$\text{adjusted estimate} = \frac{\text{raw estimate}}{0.098 + 0.219 \times \ln(\mathrm{Chr})} \qquad (3)$$

The haploid chromosome number for SBT is 23 (*39*), so we calculated bias-adjusted estimates of effective size as the raw estimate divided by 0.098 + 0.219 × ln(23) = 0.785. The net effect of this adjustment was to multiply the point estimates of *N*_{e} and *N*_{b} (and their CIs) by 1/0.785 = 1.27, an increase of 27%, to account for physical linkage.
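The linkage correction for SBT can be checked with a few lines of Python. The coefficients 0.098 and 0.219 and the chromosome number come from the text; the raw estimate of 10^{5} is an arbitrary placeholder and the function name is ours.

```python
import math


def linkage_adjust(raw_estimate, haploid_chromosomes):
    """Divide a raw LD estimate by 0.098 + 0.219 * ln(Chr)."""
    factor = 0.098 + 0.219 * math.log(haploid_chromosomes)
    return raw_estimate / factor, factor


adjusted, factor = linkage_adjust(1.0e5, haploid_chromosomes=23)
print(f"correction factor = {factor:.3f}")
print(f"adjusted / raw    = {adjusted / 1.0e5:.2f}")
```

Because the factor is below 1 for species with few chromosomes, dividing by it raises the estimate, offsetting the downward bias caused by linked locus pairs.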

As described in Supplementary Methods, we also examined the SBT genetic data for evidence that certain pairs of loci consistently produced relatively high *r*^{2} values; if so, that could lead to larger bias than was accounted for by the adjustment in Eq. 3. No such evidence was found (see figs. S3 and S4).

Age structure affects single-cohort and mixed-age samples differently (*25*). The juvenile samples are from single cohorts and thus provide information most directly relevant to *N*_{b}, but with some influence from background *N*_{e} per generation. If an estimate of the *N*_{b}/*N*_{e} ratio is available, age structure can be accounted for by dividing the raw estimate by the factor 1.26 − 0.323 × *N*_{b}/*N*_{e} (*25*). For this purpose, we used the demographic estimate of *N*_{b}/*N*_{e} = 0.121 obtained for constant ϕ = 10 (table S3). The age-structure adjustment thus consisted of dividing each raw *N*_{b} estimate by 1.26 − 0.323 × 0.121 = 1.22, which translated to a reduction of 18% for each point estimate. Mixed-age adult samples are more relevant to *N*_{e} per generation, but they are expected to be downwardly biased because of mixture LD created by combining progeny from multiple cohorts with slightly different allele frequencies. The maximum number of cohorts that could have been included in each adult sample is the adult life span (23 breeding cycles, from age 8 to 30, inclusive), which is 1.7 times the generation length of 13.3 years. Previous work (*25*) has shown that adult LD estimates of *N*_{e} were biased downward by about 25% for iteroparous species for which the ratio of adult life span to generation length is in the range 1.5 to 2. Therefore, we adjusted each raw adult estimate upward by dividing it by 0.75 to account for age-structure effects. The net effect of the linkage and age-structure adjustments combined was an increase of 4% for each raw point estimate of *N*_{b} and an increase of 69% for each raw point estimate of *N*_{e}. We also considered two other genetic estimators of effective size (one based on the incidence of siblings, and the other based on temporal changes in allele frequency), but neither proved suitable for the SBT data sets (see Supplementary Methods for details).
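The combined effect of the linkage and age-structure adjustments can be verified with a short Python calculation. All inputs are taken from the text (linkage factor 0.785, *N*_{b}/*N*_{e} = 0.121, adult age-structure factor 0.75); the variable names are ours.

```python
linkage_factor = 0.785                    # 0.098 + 0.219 * ln(23)
nb_age_factor = 1.26 - 0.323 * 0.121      # single-cohort (juvenile) samples
ne_age_factor = 0.75                      # mixed-age (adult) samples

# Each raw estimate is divided by both applicable factors.
nb_multiplier = 1.0 / (linkage_factor * nb_age_factor)
ne_multiplier = 1.0 / (linkage_factor * ne_age_factor)

print(f"juvenile age factor : {nb_age_factor:.2f}")
print(f"net change for Nb   : {100 * (nb_multiplier - 1):+.0f}%")
print(f"net change for Ne   : {100 * (ne_multiplier - 1):+.0f}%")
```

The two corrections nearly cancel for the juvenile (*N*_{b}) estimates but compound for the adult (*N*_{e}) estimates; small differences from the reported percentages reflect rounding of the intermediate factors.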

To provide some context for interpreting the LD estimates of effective size, we modified the simulation model described by Waples (*21*) to tune it to mimic the amount of data available for SBT. The number of pairwise comparisons of alleles used in the LD analyses using *P*_{Crit} = 0.01 ranged from *n* = 62,000 to 66,000, which is the number that would be produced by all pairwise comparisons of about 350 to 360 loci with two alleles each. Except for the first year of adult samples, the remaining collections of juveniles and adults all included 1160 to 1625 individuals. Therefore, we simulated genotypes for 350 unlinked, diallelic SNP gene loci in discrete-generation populations of known effective size and estimated *N*_{e} using samples of *S* = 1500 individuals. We modeled two effective sizes: *N*_{e} = 5 × 10^{5} and 2 × 10^{4}, each using 500 replicates. (Results for simulations using *N*_{e} = 10^{6} were qualitatively similar to those for *N*_{e} = 5 × 10^{5} and are not shown.) After initialization, the population was allowed to reproduce randomly for eight generations before sampling.
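The correspondence between the number of pairwise allele comparisons and the number of simulated loci can be checked in Python. For *L* diallelic loci there are *L*(*L* − 1)/2 locus pairs, so the equivalent *L* follows from solving that quadratic; the function name is ours.

```python
import math


def equivalent_diallelic_loci(n_comparisons):
    """Solve L * (L - 1) / 2 = n for L (positive root)."""
    return (1.0 + math.sqrt(1.0 + 8.0 * n_comparisons)) / 2.0


low = equivalent_diallelic_loci(62_000)
high = equivalent_diallelic_loci(66_000)
print(f"equivalent loci: {low:.0f} to {high:.0f}")

# Number of locus pairs actually generated by the 350 simulated loci.
simulated_pairs = 350 * 349 // 2
print(f"pairs from 350 loci: {simulated_pairs}")
```

Simulating 350 loci therefore yields slightly fewer comparisons than the empirical data, which makes the simulated precision, if anything, marginally conservative.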

### Statistical analysis

CIs for genetic estimates of *N*_{b} and *N*_{e} were calculated using NeEstimator V2.1, which incorporates an improved jackknife method (*40*) that accounts for pseudoreplication due to both physical linkage and overlapping pairs of loci. Distributions of estimated *N*_{e} for data simulated under the two effective sizes were compared with the empirical distributions of estimated *N*_{b} and *N*_{e} using the nonparametric Kolmogorov-Smirnov test (ks.test function in R). Infinite estimates were coded as 10^{9} for these comparisons.
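The comparison of simulated and empirical distributions can be sketched in Python as follows. The original analysis used ks.test in R; this pure-Python version computes only the two-sample test statistic *D* (the largest gap between the two empirical cumulative distributions) and no *P* value. The function names and the example values are hypothetical.

```python
import math
from bisect import bisect_right


def code_infinite(estimates, cap=1e9):
    """Replace infinite estimates with a large finite value."""
    return [cap if math.isinf(x) else x for x in estimates]


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The maximum ECDF difference is attained at an observed value.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical point estimates (illustration only, not SBT results).
empirical = code_infinite([1.5e5, 3.0e5, float("inf"), 8.0e4, float("inf")])
simulated = code_infinite([2.0e5, float("inf"), 1.2e5, 6.0e5, float("inf"), 9.0e4])

print(f"D = {ks_statistic(empirical, simulated):.3f}")
```

Because the test depends only on the rank order of the pooled values, any cap larger than every finite estimate gives the same statistic.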

## SUPPLEMENTARY MATERIALS

Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/4/7/eaar7759/DC1

Supplementary Methods

Table S1. Life table for SBT, using age at maturity α = 8 and maximum age ω = 30.

Table S2. Illustration of how to calculate effects of variation in fecundity at age on ϕ_{x} (the ratio of variance to mean reproductive success in 1 year for individuals of age *x*).

Table S3. Estimates of effective size in SBT.

Table S4. Microsatellite loci used in the analyses reported here.

Fig. S1. Demographic estimates of the *N*_{e}/*N* ratio for SBT as a function of ϕ_{x} = *V*_{k(x)}/*b*_{x}, which is the ratio of the variance to the mean reproductive success in one time period for individuals of age *x*.

Fig. S2. Distribution of missing loci across individuals.

Fig. S3. Evaluation of evidence for physical linkage.

Fig. S4. Distribution of statistics related to physical linkage for randomized data.

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is **not** for commercial advantage and provided the original work is properly cited.

## REFERENCES AND NOTES

**Acknowledgments:** We thank O. Berry and five anonymous reviewers for valuable comments on the manuscript.

**Funding:** Tenure of R.S.W. in Hobart was supported by a Frohlich Fellowship from CSIRO and by an International Science Fellowship from the National Marine Fisheries Service.

**Author contributions:** P.M.G. and R.S.W. conceived and designed the research. P.M.G. collected the genetic data. M.W.B. led the CKMR study that provided the estimate of adult abundance. R.H. and M.W.B. compiled the life history data. R.S.W. conducted the analyses and wrote the manuscript. All authors read and edited the manuscript.

**Competing interests:** The authors declare that they have no competing interests.

**Data and materials availability:** Data are available from the Dryad Digital Repository (https://doi.org/10.5061/dryad.46g67n8). All other data not presented in the manuscript have been previously published. All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.

- Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC).