The FAM171A2 gene is a key regulator of progranulin expression and modifies the risk of multiple neurodegenerative diseases


Science Advances  21 Oct 2020:
Vol. 6, no. 43, eabb3063
DOI: 10.1126/sciadv.abb3063


Progranulin (PGRN) is a secreted pleiotropic glycoprotein associated with the development of common neurodegenerative diseases. Understanding the pathophysiological role of PGRN may help uncover the biological underpinnings of these diseases. We performed a genome-wide association study to determine the genetic regulators of cerebrospinal fluid (CSF) PGRN levels. Common variants in the region of FAM171A2 were associated with lower CSF PGRN levels (rs708384, P = 3.95 × 10−12). This was replicated in another independent cohort. rs708384 was associated with increased risk of Alzheimer’s disease, Parkinson’s disease, and frontotemporal dementia and could modify the expression of the FAM171A2 gene. FAM171A2 was considerably expressed in the vascular endothelium and microglia, which are rich in PGRN. The in vitro study further confirmed that the rs708384 mutation up-regulated the expression of FAM171A2, which caused a decrease in the PGRN level. Collectively, genetic, molecular, and bioinformatic findings suggested that FAM171A2 is a key player in regulating PGRN production.


Progranulin (PGRN) is a secreted pleiotropic glycoprotein expressed in the central nervous system (CNS) and peripheral tissues. Its deficiency has been associated with the development of frontotemporal dementia (FTD) (1), Alzheimer’s disease (AD) (2), and Parkinson’s disease (PD) (3). Loss-of-function animal models revealed that potential pathogenesis mechanisms may involve neuroinflammation (4), autophagy (5), and cell signaling. PGRN overexpression provided protection against pathological protein deposition and toxicity and inhibited phenotypic progression in AD- and PD-like disease models (3, 6). Recently, PGRN in cerebrospinal fluid (CSF) was found to be increased during the course of AD (7). Together, these lines of evidence indicate that PGRN dysregulation might be involved in the pathogenesis of neurodegenerative diseases and could be a promising therapeutic target. Mounting evidence illustrates that PGRN levels might be modified by genetic variants (8, 9), including variants in the encoding gene GRN, which could partially explain variability of the protein (10). Previous genome-wide screens have identified multiple loci outside GRN [SORT1 (8) and PSRC1 (9)] as regulators of blood PGRN levels. Nonetheless, to date, no genome-wide association study (GWAS) has been conducted to reveal the genetic modulators of PGRN in the CNS. The search for CNS PGRN regulators is of tremendous importance, especially considering that (i) PGRN levels might be regulated differently in the peripheral blood and CNS (11) and (ii) increasing PGRN levels as a neuroprotective approach might be hindered by its peripheral effects of promoting carcinogenesis (12) and obesity (13). This study aimed to identify genetic modifiers of CSF PGRN levels. We first conducted a GWAS of CSF PGRN levels from 1362 adults without dementia and successfully identified an independent locus within the FAM171A2 gene region that showed genome-wide significant association with CSF PGRN levels.
Then, we used bioinformatics and cellular approaches to annotate the functionality of the identified variant and investigate its potential mechanisms underlying neurodegenerative diseases. On the basis of these findings, we propose that the FAM171A2 gene is a key regulator of CSF PGRN expression and modifies risk of neurodegenerative diseases.


This study aimed to detect genetic modifiers of CSF PGRN levels and better understand the role of PGRN in biological pathways relevant to neurodegenerative diseases. To achieve this goal, we analyzed genetic data and CSF PGRN levels from 432 individuals without dementia from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) in a GWAS. Then, we sought to replicate our findings in an independent cohort composed of 930 Chinese adults without dementia.

CSF PGRN levels were influenced by sex, educational level, and APOE ε4

In the GWAS cohort, CSF PGRN levels were significantly influenced by sex (P = 3.87 × 10−5, b = 0.003) and potentially influenced by APOE ε4 allele (P = 0.08, b = −8.73 × 10−4) and educational level (P = 0.12, b = −1.77 × 10−4). CSF PGRN levels showed no association with either age (P = 0.48) or clinical diagnosis [healthy controls (HC) versus subjects with mild cognitive impairment (MCI), P = 0.21].

Common variants in FAM171A2 gene were associated with CSF PGRN levels

A total of 432 individuals without dementia (157 HC and 275 subjects with MCI) participated in the GWAS. Age (P = 1.38 × 10−6) and APOE ε4 carrier percentage (P = 1.37 × 10−6) differed between the two diagnostic groups (Table 1). After adjusting for sex, educational level, APOE ε4 allele, and the first three principal components, no inflation due to population stratification was identified (genomic inflation factor λ = 1.012). A total of 18 single-nucleotide polymorphisms (SNPs) (5 in FAM171A2, 12 in ITGA2B, and rs5848 in GRN) on chromosome 17 were significantly associated with CSF PGRN levels, with rs708384 (genotyped) in the FAM171A2 gene yielding the most significant signal (P = 3.95 × 10−12) (Table 2 and Fig. 1A). Located in the transcription factor (TF) binding site (intron 1) of FAM171A2, rs708384 also survived the permutation-based empiric corrections for multiple testing (empiric P = 0.0002; permutation-based corrected empiric family-wise error rate controlled at 0.0002) (Fig. 1B).
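The genomic inflation factor quoted above is conventionally computed as the median of the observed 1-df chi-square statistics divided by the median of the null chi-square distribution (about 0.455). The following is a minimal Python sketch of that convention, assuming two-sided P values as input; the function name is illustrative and this is not the authors' pipeline.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Return lambda = median(observed chi-square) / median(null chi-square).

    Assumes two-sided P values strictly between 0 and 1. For 1 df, the
    chi-square statistic equals the squared normal quantile at 1 - p/2.
    """
    nd = NormalDist()
    chi2 = [nd.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Evenly spaced P values mimic a perfectly null GWAS, so lambda is ~1.
    null_p = [(i + 0.5) / 1001 for i in range(1001)]
    print(round(genomic_inflation(null_p), 3))
```

Values close to 1, such as the reported λ = 1.012, indicate little systematic inflation of the test statistics.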

Table 1 Characteristics of discovery and replication samples.

CV, coefficient of variation.

Table 2 SNPs significantly (P < 5 × 10−8) associated with CSF PGRN levels.

CHR, chromosome; BP, base pair; Alt, altered allele; UTR, untranslated region; NA, not available.

Fig. 1 GWAS results and regional plots for associations with CSF PGRN levels.

(A) Manhattan plots [showing the −log10 (P value) for individual SNP] and q-q plot. (B) Association results after permutation test. EMP1, empiric P value; EMP2, permutation-based corrected empiric family-wise error rate. (C) Regional association results for the GRN-FAM171A2-ITGA2B region. (D) Regional association results after controlling for rs708384.

In addition to the loci in FAM171A2, 13 SNPs in ITGA2B and GRN also showed significant associations with CSF PGRN levels (empiric P = 0.0002; permutation-based corrected empiric family-wise error rate controlled at <0.005). Notably, eight loci showed high linkage disequilibrium (LD) (r2 ≥ 0.8) with rs708384. After controlling for the rs708384 genotype, the association signals for all loci were markedly attenuated (P > 0.1), indicating that the relationships in these loci were driven by rs708384 (Table 2 and Fig. 1, C and D). The analysis also identified 149 suggestive loci; 92% were located in FAM171A2, ITGA2B, and GPATCH8 genes of chromosome 17. However, they did not survive after permutation-based empiric corrections (table S1). Accordingly, rs708384 in FAM171A2 gene was selected as the top SNP, for which the minor allele (A allele, MAF = 0.4074) was significantly linked to lower CSF PGRN levels in a dose-dependent manner (P = 3.95 × 10−12) (Fig. 2A).

Fig. 2 CSF PGRN levels as a function of rs708384 genotype in two samples.

(A) The minor allele (A allele, MAF = 0.41) of rs708384 was significantly associated with lower CSF levels of PGRN in a dose-dependent manner. (B) CSF PGRN levels were compared across the AA, AC, and CC genotypes of rs708384 in a larger independent cohort of 930 nondemented Chinese participants to validate the initial observed top signal. A significant association of decreasing CSF PGRN levels with increased minor allele (A) dose of rs708384 was observed, independent of age, gender, education, APOE ε4 genotype, MMSE score at baseline, CV, and rs5848 genotype.

Subgroup and sensitivity analyses

In the subgroup analysis, the minor allele (A) of rs708384 was associated with lower CSF PGRN levels in different strata according to sex (male, P = 2.67 × 10−6; female, P = 1.76 × 10−6) and baseline diagnosis (HC, P = 0.017; MCI, P = 1.36 × 10−11). Notably, the effect size was markedly larger in the HC population (beta for HC = 100.9; beta for MCI = 8.77 × 10−5) (fig. S2A). As CSF PGRN levels were reported to be influenced by age, diagnosis, and rs5848 in the GRN gene (11), sensitivity analysis was performed to further adjust for these confounders. Adjusting for age and baseline diagnosis did not significantly change the results (P = 3.51 × 10−12). After adjusting for rs5848, a strong, although less significant, association for rs708384 with CSF PGRN levels was found (P = 8.28 × 10−5) (fig. S2B). Excluding those who developed AD within 3 years from baseline did not change the result, and rs708384 remained the top SNP (n = 359, P = 4.73 × 10−10) (fig. S2C). According to Aβ pathology status, subgroup analysis showed the same dose-response relationship for both groups (P < 5.00 × 10−4 for Aβ-negative group, P = 1.07 × 10−7 for Aβ-positive group). Furthermore, we added the rs5848 genotype, which did not pass the GWAS quality control (QC) stage, as a covariate in the regression model. After adjusting for the rs5848 genotype and other covariates mentioned above, the association of PGRN levels with rs708384 remained significant (n = 337; P = 0.0009).

Specificity of the association with PGRN

We tested whether the observed association may be influenced by other candidate proteins related to neurodegeneration. First, the significance remained unchanged after including CSF levels of Aβ42, T-tau, P-tau, α-synuclein (α-SYN), neurofilament light chain (NFL), and secreted triggering receptor expressed on myeloid cells 2 (sTREM2) as covariates. Next, no correlations with Aβ42 or P-tau biomarkers were identified (r2 ranging from 0.04 to 0.1) although weak correlations were observed of CSF PGRN with T-tau (P = 0.04, r = 0.1), α-SYN (P = 0.0003, r = 0.36), NFL (P = 0.03, r = 0.22), and sTREM2 (P = 1.134 × 10−8, r = 0.27) levels. Furthermore, no associations of rs708384 were revealed with any of the abovementioned proteins when they were separately used as the endophenotype. Together, these results supported the specificity of our results (table S2).

Variability in CSF PGRN levels explained by genetic variants

We used a genome-partitioning analysis to estimate the proportion of variance in CSF PGRN levels explained by chromosomes, significant genes, and loci. All genotyped and imputed variants on chromosome 17 explained approximately 17.4% of the variability in the CSF PGRN levels (P < 0.05) (Fig. 3A). Most significant loci were in high LD with rs708384 (r2 ≥ 0.8) (Fig. 3B). The SNPs in the GRN region explained 13.3% of the variability in CSF PGRN levels (P = 8.13 × 10−8), suggesting that genetic variants outside the GRN gene also played significant roles. rs708384 alone explained 9.1% of the variability, with the smallest P value (P = 2.60 × 10−10) (Fig. 3C). Although rs5848 was not genotyped in ADNI, the 1000 Genomes Project data indicated that rs5848 has only low-to-moderate LD (r2 = 0.6) with rs708384 in the CEU (Utah residents with Northern and Western European ancestry) population (Fig. 3D), and no other loci showed LD with rs5848. Thus, it can be reasonably inferred that the observed associations for other significant loci were more likely to be driven by rs708384.

Fig. 3 Variability in CSF PGRN levels explained by genetic variants.

(A) Chromosome 17 explained approximately 17.4% of the variability in the CSF levels of PGRN. (B) Most of the significant loci were in LD with rs708384. (C) SNPs in the GRN region explained the most but not all of the variability in CSF PGRN. Analysis of FAM171A2 and ITGA2B region showed that these two regions explained 9.1 and 5.6% of the variability in CSF levels of PGRN, respectively. rs708384 was shown to explain 9.1% of the variability. (D) rs5848 was only in low-to-moderate LD with rs708384 in CEU population (r2 = 0.6), and no loci were in LD with rs5848. Nonetheless, rs5848 was in high LD with rs708384 in CHB population (r2 ≈ 0.8).

Replication in an independent cohort and meta-analysis

To validate our finding that rs708384, as the top signal, was associated with CSF PGRN levels, we measured CSF PGRN levels and genotyped rs708384 in a larger, independent cohort of 930 Northern Han Chinese individuals without dementia (Table 1). In this cohort, the participants’ age ranged from 40 to 88 years (mean = 62.6 years, SD = 10.4 years) at the time of CSF extraction. Both sexes were well represented, with 380 men and 550 women. The mean CSF PGRN level was 1739 pg/ml (SD = 372 pg/ml), with a mean intrabatch coefficient of variation (CV) of 2.95% and a mean interbatch CV of 3.82%. Consistent with the result of the discovery cohort, the A allele of rs708384 was significantly associated with lower CSF PGRN levels in a linear regression model accounting for age, sex, educational level, APOE ε4 genotype, MMSE (mini-mental state examination) score at baseline, and CV (P = 7.47 × 10−9, Fig. 2B).

To exclude the potential influence of rs5848 in the GRN gene, we genotyped and included it as a covariate. Single regression analysis indicated that rs5848 was significantly associated with CSF PGRN levels (P < 0.005). However, the significance of rs5848 was no longer present (P = 0.88) when rs708384 was added as a covariate. In contrast, the association of rs708384 with the CSF PGRN level still reached borderline significance when controlling for rs5848 (P = 0.10). This may be because rs5848 showed relatively high but not complete LD with rs708384 in the Chinese population (r2 ≈ 0.8, Fig. 3D). The additive model (AA versus CC) indicated that the association for rs708384 remained significant (P < 0.05) after controlling for rs5848 and other covariates (table S3). These results demonstrated that the influence of rs708384 on the CSF PGRN level was not determined by rs5848. The meta-analysis of the two-stage findings further strengthened the association of rs708384 with the CSF PGRN level (P = 1.74 × 10−18). We identified that the loci previously associated with blood PGRN levels had no significant relationships with CSF PGRN level, suggesting that the genetic mechanisms of PGRN regulation may be different in the peripheral system and CNS (table S3).

Functional annotation of the top signal in the FAM171A2 gene region

Enhancer enrichment analysis showed that the tagged variants by rs708384 were significantly enriched in specific brain regions (e.g., middle hippocampus, inferior temporal lobe, and prefrontal lobe) (table S4). This suggests that the variants might be linked to the regulation of gene expression in these sites. We also found significant differential expression of FAM171A2 in the temporal cortex (P = 0.004) in AD (fig. S3). Moreover, the expression quantitative trait loci (eQTL) analyses showed that rs708384 (z = 4.74; P = 2.5 × 10−6) and some of its tagged variants can potentially influence GRN expression in both the blood (table S5) and specific brain regions (anterior cingulate cortex and frontal cortex) (table S6). In addition to GRN, rs708384 was also associated with the expression of FAM171A2 and ITGA2B in multiple sites, such as the frontal lobe (table S7). Together, the evidence points to the hypothesis that rs708384, a TF binding site, may functionally modulate the expression of GRN and its downstream gene FAM171A2. We identified that the expression of GRN was highly correlated with that of FAM171A2 (P = 1.1 × 10−17), in either the brain (P = 5.3 × 10−10) or blood (P = 3.1 × 10−6).

FAM171A2 and GRN genes were clustered in common pathways

There was a significant excess of enriched categories: 96 categories in the Protein Analysis Through Evolutionary Relationships (PANTHER) analysis and 107 categories in the Consensus PathDB (CPDB) analysis. There were 34 significant categories that overlapped in the two analyses (table S8). These categories were primarily related to regulation of nervous system development, molecular transport, signal transduction, cell-to-cell adhesion, and response to stimuli (Fig. 4A). The gene network analysis revealed a total of 31 significant categories (P < 0.001) forming four clusters of genes (Fig. 4B). As expected, the most significant genes (GRN, FAM171A2, and ITGA2B) were clustered together (Fig. 4B). The Gene Ontology (GO) items for this specific cluster included all the abovementioned categories (P < 0.001) and another two items: regulation of blood pressure (P = 3.84 × 10−4) and interleukin-8 secretion (P = 4.20 × 10−4).

Fig. 4 GO and pathway analysis.

(A) Functional categories were identified that were significantly enriched (P < 0.001), primarily including those involving regulation of nervous system development, molecular transport, signal transduction, cell-cell adhesion, and response to stimuli. GTPase, guanosine triphosphatase. (B) Four clusters were identified in the gene network analysis. The most significant genes (FAM171A2 and GRN) were clustered together. (C) The highest z-scores were achieved by FAM171A2 and GRN. z-score was used by GeneNetwork Assisted Diagnostic Optimization to prioritize the candidate genes: Gene with a higher z-score is more likely to explain the phenotype.

Among 16 pathways [12 from REACTOME and 4 from Kyoto Encyclopedia of Genes and Genomes (KEGG)] that achieved the statistical significance level (P < 0.005), the L1 cell adhesion molecule (L1CAM) interaction pathway was the most significant (REACTOME, P = 2.49 × 10−6), for which the highest z-scores were achieved by FAM171A2 and GRN gene (Fig. 4C). Most other significant pathways were also primarily related to cell-to-cell interactions (table S9).

FAM171A2 considerably expressed in the cerebral vascular endothelium and microglia

It was essential to determine the location of FAM171A2 in the brain before investigating its functions. Therefore, we conducted immunohistochemistry (IHC) and immunofluorescence (IF) staining on brain sections (cortex and hippocampus) of 10-week-old male C57BL/6 mice. Evident DAB (Diaminobenzidine) staining was observed around the vessels (Fig. 5, A1 and B1, tilted arrows) and on cells in a similar form to microglia (Fig. 5, A2 and B2, horizontal arrows). The IF staining of FAM171A2 colocalized with CD31 and IBA1 on the brain sections (Fig. 5, C1, C2, D1, and D2). Five independent experiments were conducted, and the abovementioned results indicate that FAM171A2 was considerably expressed in the cerebral vascular endothelium and microglia.

Fig. 5 FAM171A2 high expression on cerebral vascular endothelium and microglia.

(A and B) The IHC staining of FAM171A2 on mouse cortex and hippocampi. The DAB staining along and around the cerebral vascular (A1 and B1) was marked by tilted arrows. The DAB staining on the cells in a similar form of microglia (A2 and B2) was marked by horizontal arrows. (C and D) The IF staining of FAM171A2 with the CD31 and IBA1 antibodies on mouse cortex and hippocampus. Their colocalization was marked by “*” and “#.” n = 5 mice in these experiments.

Targeting rs708384 regulates expression of FAM171A2 and GRN genes in vitro

First, we conducted a dual-luciferase reporter assay in human embryonic kidney 293 (HEK293) cells to evaluate the regulatory effect of rs708384 on FAM171A2 (Fig. 6A). A total of four independent experiments were conducted. We observed that the c > a mutation significantly increased the ratio of firefly/Renilla luciferase reporter gene expression compared to the wild type. This indicates that c > a stimulated the expression of FAM171A2 (Fig. 6B).

Fig. 6 rs708384 stimulates FAM171A2 expression and subsequently inhibits GRN/PGRN level.

(A) The structure of firefly luciferase reporter plasmid. The sequence containing rs708384 was labeled by a red square. (B) The plasmids with or without rs708384 (c > a), including the empty control, had different expressing levels of firefly luciferase after transfection into the HEK293 cells. The luminous intensity was calibrated by Renilla luciferase, and the result was presented by ratio of firefly/Renilla luciferase. n = 4 per group, the data were analyzed by the one-way analysis of variance (ANOVA) (P < 0.0001) followed by the Tukey post hoc test (P = 0.0002 c > a versus wild type). (C and D) The change of intracellular GRN after FAM171A2 overexpression. n = 5 per group, the data were analyzed by t test (P = 0.0002). (E) The change of the supernatant PGRN levels by FAM171A2 overexpression (OE). n = 5 per group, the data were analyzed by t test (P < 0.0001).

Next, we examined the influence of FAM171A2 overexpression on GRN/PGRN levels. Cultured human umbilical vein endothelial cells were used, and five independent experiments were conducted. A significant decrease in the intracellular GRN level (Fig. 6, C and D) and the supernatant PGRN level (Fig. 6E) was observed after FAM171A2 overexpression, which was consistent with the GWAS data. Meanwhile, FAM171A2 was knocked down by small interfering RNA (siRNA), and we observed a significant increase in the intracellular GRN level (fig. S4, A and B). The efficiency of FAM171A2 siRNA knockdown and overexpression was verified by Western blot analysis (fig. S4, C and D). These results showed that FAM171A2 overexpression led to decreased GRN/PGRN levels, while knockdown had the opposite effect.

Impact of rs708384 on phenotypes of neurodegenerative disease

Last, we tested whether rs708384 was related to Aβ pathologic features, CSF biomarkers of neuronal injury, cognitive functions, and brain structure/metabolism using the ADNI database, and to the risk of neurodegenerative diseases [including AD, PD, amyotrophic lateral sclerosis (ALS), and FTD] using previously published GWAS summary statistics. The “AA” genotype of rs708384 was significantly linked to higher CSF tau levels (fig. S5A) and poorer global cognition (fig. S5, B and C). There were no significant influences of rs708384 genotypes on Aβ pathologic features, CSF P-tau levels, or brain structure/metabolism of the regions of interest. Ten variants were tagged by rs708384 with r2 ≥ 0.8. In the large-scale GWAS dataset, these loci were significantly related to AD, PD, and FTD risk, with the effect allele (A) of rs708384 associated with increased risks [P = 0.01 for AD (14), P = 1.42 × 10−5 for PD (15), and P = 0.04 for FTD (16)]. No significant relationships of rs708384 or its tagged loci were found for ALS (table S10).


It has previously been shown that loci on genes other than GRN could modify blood PGRN levels (8, 9). However, no previous GWAS has been conducted to report the genetic modifiers of PGRN in CSF. In the present study, we conducted a genome-wide analysis of CSF PGRN levels and observed an independent signal in the FAM171A2 gene. Collectively, evidence from genetic, molecular, and bioinformatic findings suggests that the FAM171A2 gene is a key regulator of PGRN production.

Several sources of evidence support the identified genome-wide locus as a real and specific signal rather than a spurious type I error. First, rs708384 showed a highly significant P value and survived the permutation-based empiric corrections for multiple testing (Fig. 1). Moreover, the locus was directly genotyped, eliminating the possibility that the signal was a result of an imputation error. Second, the association was successfully replicated in a larger, independent sample. Third, both bioinformatics and in vitro evidence indicated that rs708384 could modulate the expression of the GRN and FAM171A2 genes. Fourth, FAM171A2 and GRN were closely connected and may functionally act together in the human brain. Last, the genotype of rs708384 was associated with the CSF tau biomarker and the risk of neurodegenerative diseases. This finding was consistent with that of a recent study in which PGRN deficiency was reported to increase AD risk by influencing tau rather than Aβ pathology (17).

It has recently been found that PGRN expression was up-regulated in patients with AD compared to controls (18). Thus, participants with dementia were excluded from the present analysis to preclude the potential selection bias due to case ascertainment. Moreover, sensitivity or subgroup analysis showed that the results were significant in both Aβ-positive and Aβ-negative groups. Subgroup analysis indicated that the influence of rs708384 on CSF PGRN levels was vastly greater in HCs than in patients with MCI.

GRN and FAM171A2 are clustered in common pathways that are potentially involved in regulating CSF PGRN levels. The most significant is the L1CAM interaction pathway. L1CAM is an axonal glycoprotein that is important for neurite outgrowth and neuronal survival (19). Although previous animal studies showed that this molecule reduces AD pathology (20), it is still unclear how it contributes to the development of neurodegenerative diseases via interaction with PGRN. In addition, our GO analyses included categories related to oxidative stress, transport system of neurons, sensory perception of smell and sound, and endocytosis. Previous studies demonstrated that all these biological processes were associated with neurodegenerative diseases, especially AD, PD, and FTD. Future research is necessary to explore the roles of PGRN in these processes and how these may influence the progression of neurodegenerative diseases.

Several limitations should also be noted. The validation cohort is based on Han Chinese individuals recruited from the hospital, which constrains generalizability. Future studies are required to replicate the association in community-based cohorts of other ancestries. As a critical covariate in the GWAS stage, rs5848, which did not pass the QC filter, was imputed but not genotyped. We provided multiple forms of evidence to lower this risk: (i) rs5848 was in low-to-moderate LD with rs708384 in the CEU population, suggesting that the potential influences were limited; (ii) the consistency test showed a high imputation accuracy (n = 427, k = 0.874) for the rs5848 locus; and (iii) we found that the association of rs708384 with the CSF PGRN level was only slightly influenced by genotyped rs5848, in either ADNI or Chinese Alzheimer’s Biomarker and LifestylE (CABLE). The in vitro evidence suggests that rs708384 could influence expression of FAM171A2, which showed close relationships with expression of the GRN gene. However, the specific mechanisms have yet to be studied. In future research, we plan to investigate the specific mechanisms by which FAM171A2 regulates PGRN. We will also focus on the role of FAM171A2 in the expression of PGRN within microglia and in various models of neurodegenerative diseases. Further investigation is needed to determine whether the effect size of the top locus on CSF PGRN varies by diagnostic group, because the present study focused on individuals without dementia.

In summary, we found evidence that FAM171A2, downstream of GRN, is a novel genetic regulator of PGRN production. Future studies are warranted to fully understand FAM171A2 function in the brain and the mechanisms by which FAM171A2 affects PGRN levels. Targeting FAM171A2 may potentially modify neurodegenerative disease risk by regulating brain PGRN levels.


Participants for GWAS

In the discovery cohort, 432 (HC = 157, MCI = 275) non-Hispanic white nondemented individuals were included from the Alzheimer’s Disease Neuroimaging Initiative GO/2 (ADNI-GO/2). Table 1 summarizes the characteristics of this sample. Data used in the preparation for GWAS were obtained from the ADNI database. The original cohort with CSF PGRN and GWAS data comprised 508 participants. We confined our analysis to non-Hispanic white individuals (n = 434) to constrain the risk of bias from population stratification. We tested for unanticipated duplicates and cryptic relatedness among samples using pairwise genome-wide estimates of proportion identity-by-descent by the PLINK software (beta 6.4) (21). The genome-wide complex trait analysis (GCTA) was used to calculate principal components and confirm the ethnicity of the samples (22). Two outlier participants were removed, and 432 were finally included (fig. S1). The study was approved by institutional review boards of all participating institutions, and written informed consent was obtained from all participants or authorized representatives.

CSF PGRN measurements

CSF PGRN levels were measured by a previously reported sandwich immunoassay (7) using the Meso Scale Discovery platform (23). All CSF samples were distributed randomly across plates and measured in duplicate. All the antibodies and plates were from a single lot to exclude variability between batches. Experiments were performed by experienced operators blinded to clinical information. The mean intrabatch CV was 2.2% (all duplicate measures had CV < 15%), and the mean interbatch CV was 4.21%. PGRN levels were corrected for interbatch variation, and corrected values were used for analyses.
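For readers unfamiliar with the CV figures above, the coefficient of variation of a duplicate pair is the sample standard deviation divided by the mean, expressed as a percentage. The sketch below shows this calculation under that standard definition; the function name and example values are illustrative and not taken from the study data.

```python
from statistics import mean, stdev


def duplicate_cv_percent(first, second):
    """Coefficient of variation (%) of two replicate measurements.

    Uses the sample standard deviation (n - 1 denominator), which for a
    pair equals |first - second| / sqrt(2).
    """
    pair = [first, second]
    return 100.0 * stdev(pair) / mean(pair)


if __name__ == "__main__":
    # Hypothetical duplicate readings in pg/ml.
    print(round(duplicate_cv_percent(100.0, 110.0), 2))
```

Averaging this quantity over all duplicate pairs on a plate gives an intrabatch CV of the kind reported here.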

Genotyping and imputation

The ADNI-GO/2 samples were genotyped by the Human OmniExpress BeadChip (Illumina Inc., San Diego, CA). Before association analysis, all samples and genotypes underwent stringent QC with the following criteria: call rates for SNPs > 95%, call rates for individuals > 95%, minor allele frequencies > 0.2, and Hardy-Weinberg equilibrium test P > 0.001. The original dataset included a total of 710,618 genotyped variants, including rs7412 and rs429358, which were genotyped separately by an APOE genotyping kit to define the APOE ε2/ε3/ε4 isoforms, as previously described (24). Following the standard procedure, imputation was conducted using Beagle software (version 5.0) with the HapMap GRCh37 as reference. SNPs with a Beagle R2 < 0.8, a call rate < 95%, a minor allele frequency < 0.2, and Hardy-Weinberg equilibrium test P < 0.001 were removed. Last, the filters and QC produced a total of 3,262,988 imputed and genotyped SNPs for analyses.
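The per-SNP filters described above can be sketched from genotype counts. This is a minimal illustration using the thresholds stated in the text; it applies a 1-df chi-square goodness-of-fit test for Hardy-Weinberg equilibrium, which is an assumption made for brevity (dedicated tools such as PLINK typically use an exact test), and the function and field names are hypothetical.

```python
import math


def snp_passes_qc(n_aa, n_ab, n_bb, n_missing,
                  min_call_rate=0.95, min_maf=0.2, min_hwe_p=0.001):
    """Apply call-rate, MAF, and HWE filters to one SNP's genotype counts."""
    n_called = n_aa + n_ab + n_bb
    call_rate = n_called / (n_called + n_missing)

    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1.0 - freq_a)

    if maf == 0.0:
        hwe_p = 1.0  # monomorphic SNP: HWE test is undefined, MAF filter removes it
    else:
        expected = (n_called * freq_a ** 2,
                    2 * n_called * freq_a * (1.0 - freq_a),
                    n_called * (1.0 - freq_a) ** 2)
        chi2 = sum((obs - exp) ** 2 / exp
                   for obs, exp in zip((n_aa, n_ab, n_bb), expected))
        # Survival function of a chi-square with 1 df.
        hwe_p = math.erfc(math.sqrt(chi2 / 2.0))

    passed = call_rate > min_call_rate and maf > min_maf and hwe_p > min_hwe_p
    return {"call_rate": call_rate, "maf": maf, "hwe_p": hwe_p, "passed": passed}


if __name__ == "__main__":
    # Hypothetical counts in perfect Hardy-Weinberg proportions (p = 0.4).
    print(snp_passes_qc(16, 48, 36, n_missing=2))
```

Sample-level call rate would be filtered analogously, by counting missing genotypes per individual instead of per SNP.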

Statistical methods

Because the CSF PGRN values were skewed (Shapiro-Wilk test P < 0.05), data transformation was performed to achieve a normal distribution using the “car” package in R. A linear regression model was used to determine the associations of CSF PGRN levels with genetic polymorphisms using PLINK v1.9 with an additive genetic model. Stepwise linear regression analysis was used to determine whether CSF levels of PGRN were influenced by diagnosis, age, gender, education, APOE ε4 allele, and CV of PGRN measurements. Those with P < 0.2 were included as covariates. To correct for confounding by genetic ancestry that could lead to population stratification, the first three principal components of a genetic relationship matrix between pairs of individuals were further included as covariates. Given that 30% of nondemented elders met the biomarker criteria for AD (25), subgroup analyses were first conducted for Aβ pathological status (A positive versus A negative) according to A-T-N criteria, for which A positive was defined as positive evidence of cerebral Aβ deposition as detected by PET (AV45 > 1.11) or CSF (Aβ < 192 pg/ml) (26). We further excluded those who progressed to dementia within 3 years of baseline. We also performed sensitivity analyses including rs5848 (GRN, imputed), age, and diagnosis as covariates because it was previously shown that levels of PGRN might be influenced by these factors (11, 18). Analyses stratified according to gender and diagnoses were conducted to examine the strata effect.
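Under an additive genetic model, each genotype is coded as the number of minor alleles (0, 1, or 2) and entered into a linear regression alongside the covariates. The sketch below fits such a model by ordinary least squares in plain Python. It illustrates the model form only and is not the PLINK implementation; it returns coefficients without standard errors or P values, and all names and data are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def additive_model_fit(dosage, covariates, phenotype):
    """Fit phenotype ~ intercept + minor-allele dosage + covariates.

    dosage: list of 0/1/2 minor-allele counts, one per individual.
    covariates: list of per-individual covariate lists (e.g. sex, PCs).
    Returns [intercept, beta_snp, beta_cov1, ...].
    """
    rows = [[1.0, float(g)] + [float(c) for c in cov]
            for g, cov in zip(dosage, covariates)]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, phenotype)) for i in range(k)]
    return solve_linear_system(xtx, xty)


if __name__ == "__main__":
    # Hypothetical data: each minor allele lowers the trait by 0.5 units.
    dosage = [0, 1, 2, 0, 1, 2, 1, 0]
    sex = [0, 0, 0, 1, 1, 1, 0, 1]
    trait = [2.0 - 0.5 * g + 0.3 * s for g, s in zip(dosage, sex)]
    print(additive_model_fit(dosage, [[s] for s in sex], trait))
```

A negative SNP coefficient corresponds to the dose-dependent decrease described for the minor allele: each additional copy shifts the expected trait value by the same amount.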

The specificity of the association with top signal (the SNP with the smallest P value) was explored to examine whether the association was influenced by other CSF proteins linked with neurodegenerative diseases, including Aβ, p-tau181 (P-tau), total tau (T-tau) (27), α-SYN (28), NFL (29, 30), and sTREM2 (31, 32). The measurement data of all these proteins were downloaded from ADNI: (i) The associations of top signal with these proteins were separately explored via single variable linear regression; (ii) the relationships between normalized levels of CSF PGRN and other CSF proteins were estimated using Pearson’s product moment correlation coefficient; (iii) these CSF proteins were respectively included as covariates in the model. Statistical analyses and data visualization were performed in R v3.5.1, PLINK v1.9 (21), GCTA v1.91.6beta (33), and LocusZoom v1.3 (34).

Bonferroni-corrected statistical significance was defined as P < 5 × 10−8, and P < 1 × 10−5 was considered suggestive association. To identify additional independent genetic signals, conditional analyses were performed by adding the top SNP as a covariate and testing all remaining regional SNPs for association. As an additional safeguard against false-positive results, the PLINK max(T) permutation test with 5000 permutations was used to generate empirical P values corrected for multiple testing. Genome-wide associations were visualized with the R package “qqman”; genome partitioning was conducted using GCTA to estimate the proportion of phenotypic variance explained (35).
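
The logic of a max(T) permutation correction can be sketched as follows. This is a simplified illustration, not PLINK's code: it uses the absolute Pearson correlation as the test statistic and assumes the (R + 1)/(N + 1) convention for empirical P values; all function names are hypothetical.

```python
import random
from typing import List, Sequence


def abs_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Absolute Pearson correlation, used here as a stand-in test statistic."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / (sxx * syy) ** 0.5


def max_t_empirical_p(
    genotypes: Sequence[Sequence[float]],
    phenotype: Sequence[float],
    n_perm: int = 5000,
    seed: int = 0,
) -> List[float]:
    """Family-wise empirical P value per SNP from phenotype permutations.

    Each observed statistic is compared with the maximum statistic over
    all SNPs in every permuted data set.
    """
    rng = random.Random(seed)
    observed = [abs_correlation(g, phenotype) for g in genotypes]
    exceed = [0] * len(genotypes)
    shuffled = list(phenotype)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype link
        perm_max = max(abs_correlation(g, shuffled) for g in genotypes)
        for i, obs in enumerate(observed):
            if perm_max >= obs:
                exceed[i] += 1
    return [(count + 1) / (n_perm + 1) for count in exceed]
```

Because every SNP is compared against the same permutation-wise maximum, the resulting P values account for the number of tests without assuming that the tests are independent.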

Replication cohort

We used an independent replication cohort of nondemented northern Han Chinese elders drawn from the CABLE study. The study was approved by the Institutional Review Board of Qingdao Municipal Hospital. CABLE is an ongoing large-scale study, begun in 2017, that focuses mainly on risk factors and biomarkers for AD in the Chinese Han population. CABLE aims to determine the genetic and environmental modifiers of AD biomarkers and their utility in early diagnosis. Individuals were recruited at Qingdao Municipal Hospital, Shandong Province, China.

All enrolled participants were Han Chinese aged between 40 and 90 years. The exclusion criteria were as follows: (i) CNS infection, head trauma, epilepsy, multiple sclerosis, or other major neurological disorders; (ii) major psychological disorders; (iii) severe systemic diseases (e.g., malignant tumors); and (iv) family history of genetic diseases. All participants underwent clinical and neuropsychological assessments, biochemical testing, and blood and CSF sample collection. Demographic information and medical history were collected via a structured questionnaire and an electronic medical record system.

CSF PGRN levels in CABLE were determined with the Human PGRN Enzyme-Linked Immunosorbent Assay (ELISA) kit (Biovendor-Laboratornimedicina A.s., Czech Republic) on a microplate reader (Thermo Fisher Scientific Multiskan MK3). Samples were diluted 10-fold and run in duplicate. The most significant locus (top SNP) in the discovery stage was chosen for validation. Several loci were selected for genotyping with restriction fragment length polymorphism technology, including rs708384 (top SNP for validation), rs5848 (important confounder in GRN), two loci related to APOE ε4 status (rs7412 and rs429358), and three loci (rs660240, rs4747197, and rs646776) associated with blood PGRN in previous GWAS. As of October 2019, eligible CSF PGRN measurements and genotyping data were available for 930 nondemented individuals. Data analysis was performed with linear regression in R, adjusting for age, sex, education, APOE ε4 status, MMSE at baseline, and CV of PGRN measurements. Sensitivity analysis was conducted by including rs5848 as a covariate. Meta-analysis was performed to pool the P values of the two-stage findings using “metap” in Stata/SE 12.0 software. We also examined whether the loci associated with blood PGRN influenced the levels in CSF.
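
The text does not state which combination method was selected in “metap.” As one common possibility, and purely as an assumption for illustration, Fisher's method for pooling independent P values can be sketched in Python as follows; the function name is hypothetical.

```python
import math
from typing import Sequence


def fisher_combined_p(p_values: Sequence[float]) -> float:
    """Combine independent P values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null. For even degrees of freedom
    the upper tail has the closed form
        exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    """
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # this is X / 2
    tail_sum = sum(half_x ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_x) * tail_sum
```

For a discovery and a replication stage, the function would be called with the two stage-specific P values, for example `fisher_combined_p([p_discovery, p_replication])`, where both names are placeholders.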

Bioinformatics analyses

SNP annotations were conducted using the NCBI Database of SNPs (dbSNP, GRCh37/hg19 assembly, 105 release) (36), SNPnexus (37), and the SNP and CNV Annotation Database (SCAN). The potential regulatory functions were examined using HaploReg v4.1 (38), RegulomeDB v1.1 (build 141 of dbSNP) (39), and the 1000 Genomes Project. LD analyses were conducted on the basis of data from the 1000 Genomes Project [EUR and Han Chinese in Beijing, China (CHB)]. Variants tagged by the top SNP were used to perform the following enhancer enrichment analysis and expression analysis.

Enhancer enrichment analysis was conducted via HaploReg v4.1 to evaluate in which cell types the tag variants were significantly enriched. Enhancers were defined using Roadmap Epigenomics data with four different methods (including the 15-state core model, the 25-state model incorporating imputed epigenomes, the H3K4me1/H3K4me3 peaks, and the H3K27ac/H3K9ac peaks) (40). A binomial test was performed using all 1000 Genomes variants with a MAF > 5% in any population as the background set. A total of 28 blood cells and 13 brain cells were selected for the present analysis.
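
The binomial enrichment test can be illustrated with a short Python sketch. It assumes a one-sided (upper-tail) test in which the background probability is the fraction of background variants overlapping enhancers in a given cell type; this framing and the function name are assumptions for illustration and not a description of HaploReg's internals.

```python
import math


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p).

    n: number of tag variants tested
    k: number of those variants overlapping enhancers in the cell type
    p: background overlap frequency for that cell type
    """
    return sum(
        math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
        for i in range(k, n + 1)
    )
```

A small tail probability indicates that more tag variants fall in enhancers of that cell type than the background frequency would predict.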

To validate the result of the enhancer enrichment analysis and characterize these associations, expression analyses were performed to determine (i) whether the expression levels of the genes in which the identified significant loci are located are associated with AD case-control status. The differential expression of identified genes in AD blood (41) and brain (42) (fig. S3) was investigated by analyzing the gene expression profiles from Gene Expression Omnibus (GEO) via the NCBI web tool GEO2R and the AlzData web server (43). P < 0.05 after adjustment for multiple testing was regarded as statistically significant (44); (ii) whether SNPs associated with CSF PGRN also affect the protein-encoding gene (GRN) expression levels in brain and blood; (iii) whether the significant SNPs were associated with expression levels of the candidate genes within each locus. To this end, eQTL analyses were conducted using multiple publicly available datasets in human brain tissues [including brain expression GWAS eQTL datasets (45) and Genotype-Tissue Expression (GTEx) (46)] and whole blood [including Blood eQTL browser (47), Consortium for the Architecture of Gene Expression browser (48), NCBI Molecular QTL Browser Search database (49), and Framingham Heart Study eQTL database (49)]. The eQTL results were also searched from the 1000 Genomes Project. (iv) Pair-wise gene expression correlation analysis was performed using public RNA sequencing data of 31,499 samples and GTEx expression data via the GEPIA web server (50).

GO overrepresentation analyses and pathway analyses

The Association List GO Annotator method (51) was used to search for the GO terms and KEGG pathways. SNPs were clumped if they were within 1 Mb and in high LD (r2 ≥ 0.8) with the index SNP (defined as P < 1.0 × 10−4). These SNPs were mapped to genes using SNP annotation tools mentioned above. SNPs were mapped to a gene if they were situated within 20 kb of that gene; genes were only counted once irrespective of how many SNPs were mapped to the gene. Last, 34 genes were included.
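
The positional mapping rule (a SNP maps to a gene if it lies within 20 kb of that gene, and each gene is counted once) can be sketched in Python as below. The LD clumping step is not shown, and the data structures, function name, and any coordinates passed in are hypothetical.

```python
from typing import Dict, Set, Tuple

WINDOW_BP = 20_000  # 20 kb flanking window on either side of each gene


def map_snps_to_genes(
    snps: Dict[str, Tuple[str, int]],
    genes: Dict[str, Tuple[str, int, int]],
) -> Set[str]:
    """Return the set of genes with at least one SNP within the window.

    snps:  SNP id -> (chromosome, position)
    genes: gene name -> (chromosome, start, end)
    A set is returned so that each gene is counted once regardless of
    how many SNPs map to it.
    """
    mapped: Set[str] = set()
    for chrom, pos in snps.values():
        for name, (g_chrom, start, end) in genes.items():
            if chrom == g_chrom and start - WINDOW_BP <= pos <= end + WINDOW_BP:
                mapped.add(name)
    return mapped
```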

GO overrepresentation analyses were performed using the PANTHER statistical overrepresentation test v9.0 (52) and the CPDB overrepresentation gene set analysis Release 33 (53). Both used data from the GO Consortium and calculated overrepresentation of candidate genes relative to different backgrounds (18,043 genes for CPDB and 20,814 genes for PANTHER). Of the 34 genes in the analysis, CPDB recognized all, and 29 were assigned to at least one GO term by PANTHER. The resulting P values were corrected for multiple testing using the false discovery rate method in CPDB (P < 0.05). For PANTHER, Bonferroni-adjusted (six separate tests) P values were used (P < 8.33 × 10−3).
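
The two corrections mentioned above can be made concrete with a short Python sketch. The Bonferroni threshold for six tests is 0.05/6 ≈ 8.33 × 10−3. For the false discovery rate, the text does not name the procedure; the Benjamini-Hochberg step-up adjustment is shown here as an assumption, and the function names are hypothetical.

```python
from typing import List, Sequence


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```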

In addition, multiple sources of pathway sets were used, including KEGG (1 August 2018) (54), PANTHER v8.1 (52), Reactome pathways v65 (13 June 2018) (55), WikiPathways, and Prioritization And Functional Assessment (56). Last, the function of the gene network was predicted on the basis of the functional enrichment of its coregulatory partners using the GeneNetwork Assisted Diagnostic Optimization tool.

In vitro and in vivo experiments

Luciferase reporter plasmid and antibodies. The plasmid was purchased from Genomeditech Co. Ltd., Shanghai, China. Briefly, we cloned an approximately 500-bp region of FAM171A2 intron 1 containing either SNP rs708384 or the wild-type site and inserted the cloned fragment into a pGL3-promoter vector, which contained the SV40 promoter upstream of the firefly luciferase reporter gene. Empty pGL3-promoter vector was transfected as a systemic control. Antibodies used were as follows: anti-FAM171A2 antibody (1:20 for IHC and IF; ATLAS, HPA019770); anti-FAM171A2 antibody (1:500 for Western blot; Proteintech, 20836-1-AP); anti-CD31 antibody (1:250 for IF; ab24590); anti-granulin antibody (1:1000 for Western blot; ab108608); and anti–β-actin antibody (1:1000 for Western blot; Affinity, AF7018).

Animals. Eight-week-old C57BL/6 mice were purchased from Beijing Vital River Laboratory Animal Technology Co. Ltd. and housed at 23° ± 1°C and 55 ± 5% humidity under a 12-hour light/dark cycle (lights on between 07:00 and 19:00) with ad libitum access to food and water. Mice were euthanized for IHC and IF staining (n = 5) after adaptation for 2 weeks. The study was performed with the approval of the Institutional Animal Care and Use Committee of Fudan University (approval number: JS-256). The experiments were performed according to the National Institutes of Health Guide for the Care and Use of Laboratory Animals.

FAM171A2 knockdown and overexpression. In vitro FAM171A2 knockdown was achieved with siRNA purchased from GenePharma Co. Ltd., Shanghai, China. Sequence: 5′-GCAAUGGCACUGGUGUAAUTT, AUUACACCAGUGCCAUUGCTT-3′. The FAM171A2 overexpression plasmid was purchased from GenePharma; the pEX-3 vector was used for plasmid construction. Both the FAM171A2 siRNA and the plasmid were transfected into human umbilical vein endothelial cells (HUVECs) with Lipofectamine 2000. The effect of knockdown and overexpression was confirmed 48 hours after transfection.

Cell culture. HUVECs were purchased from iCell, Shanghai, China. This cell line was immortalized by lentiviral transformation in 2018 from human primary umbilical vein endothelial cells, which was confirmed by CD31 IF staining. HUVECs were cultured in extracellular matrix + 5% fetal bovine serum (FBS) + 1% endothelial growth factor + 1% penicillin/streptomycin (P/S). HEK293 cells were purchased from the Chinese Academy of Sciences Cell Bank and cultured in Dulbecco’s modified Eagle’s medium + 5% FBS + 1% P/S. The incubator maintained an environment of 37°C and 5% CO2. Cells were subcultured at 1:3 on reaching 80% confluence.

Dual-luciferase reporter assay. The assay was conducted according to the technical manual of the Promega Dual-Luciferase Reporter Assay System. The firefly and Renilla luciferase reporter plasmids (5:1) were cotransfected into HEK293 cells with Lipofectamine 2000. After 24 hours, cells were lysed with the passive lysis buffer provided by the manufacturer. Luciferase Assay Reagent II and Stop & Glo Reagent were added to the lysates sequentially. Luminescence intensity was detected on the Synergy H1 hybrid reader in luminescence mode.

Immunohistochemistry. Mice were deeply anesthetized by injection of pentobarbital sodium and perfused with saline solution followed by 4% (w/v) paraformaldehyde. Hippocampi and cortex were sectioned frontally using a freezing microtome, and 30-μm serial sections were collected. Sections were incubated in blocking buffer [0.01% Triton X-100 and 20% normal goat serum in phosphate-buffered saline (pH 7.4)] for 1 hour and were subsequently incubated at 4°C overnight in rabbit anti-FAM171A2 primary antibody, followed by 2 hours in biotinylated secondary antibody at room temperature. After three washes, sections were incubated in avidin-biotin complex (ABC Standard, Vector Laboratories), and color was developed using 0.025% 3,3′-diaminobenzidine and 0.1% H2O2. Then, the sections were mounted on microslides and immersed in hematoxylin staining solution for 1 min, followed by deionized water washes. After treatment with ethanol and dimethylbenzene, the sections were coverslipped with neutral gum. An Olympus camera (DP72; Olympus) was used to take images.

IF staining. The process was similar to that for IHC. The brain sections were incubated with the rabbit anti-FAM171A2 antibody together with the mouse anti-CD31 antibody and shaken overnight. After washing, sections were immersed in goat anti-rabbit Alexa Fluor 488 and goat anti-mouse Alexa Fluor 647 (FAM171A2 and CD31) or donkey anti-rabbit Alexa Fluor 488 and donkey anti-goat Alexa Fluor 594 (FAM171A2 and IBA1) for 2 hours, followed by washing, and were imaged using the Olympus FV10 laser scanning confocal microscope.

Western blotting analysis. Thirty micrograms of protein per sample was loaded into each lane of a 10% SDS–polyacrylamide gel electrophoresis gel. After electrophoresis, proteins were transferred to polyvinylidene difluoride membranes at a constant current of 250 mA for 90 min; the membranes were then blocked with 5% bovine serum albumin for 1 hour and incubated in the rabbit anti-granulin antibody overnight. After washing, membranes were incubated with horseradish peroxidase–conjugated secondary antibodies for 2 hours. The blots were visualized using the Super Signal West Pico Chemiluminescent Substrate (Thermo Fisher Scientific Inc.). The grayscale was measured with ImageJ. All experiments were performed in triplicate. The final data were expressed as the ratio of the relative optical density (OD) of the protein of interest to that of β-actin.

Enzyme-linked immunosorbent assay. We used the R&D PGRN Quantikine ELISA kit to detect the supernatant PGRN content. The experiment was conducted according to the manual of the ELISA kit. Briefly, 50 μl of cell lysate was added per well and incubated for 2 hours at room temperature. After four washes with 400 μl of wash buffer per well, 200 μl of human PGRN conjugate was added to each well and incubated for 2 hours at room temperature. Subsequently, 200 μl of substrate solution was added to each well and incubated for 30 min at room temperature. Last, the reaction was stopped with 50 μl of stop solution per well. Within 30 min, the OD of each sample was read at 450 nm and corrected by the reading at 540 nm. PGRN contents were calculated from OD according to protein standards.
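
The final calculation step can be illustrated in Python. The sketch subtracts the 540-nm reading from the 450-nm reading and then reads the concentration off a standard curve. Piecewise-linear interpolation between standards is an assumption made for simplicity (kit manuals often recommend a fitted curve instead), and the function names and any standard values passed in are hypothetical.

```python
from typing import List, Tuple


def corrected_od(od450: float, od540: float) -> float:
    """Wavelength-corrected optical density (450-nm minus 540-nm reading)."""
    return od450 - od540


def interpolate_concentration(
    od: float, standards: List[Tuple[float, float]]
) -> float:
    """Estimate concentration from OD by linear interpolation.

    standards: list of (concentration, corrected OD) pairs with distinct ODs.
    Raises ValueError if the OD lies outside the range of the standards.
    """
    points = sorted(standards, key=lambda s: s[1])
    for (c0, o0), (c1, o1) in zip(points, points[1:]):
        if o0 <= od <= o1:
            return c0 + (c1 - c0) * (od - o0) / (o1 - o0)
    raise ValueError("OD outside the range of the standard curve")
```

Samples that were diluted before measurement would additionally be multiplied by the dilution factor.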

Statistics. Data are presented as means ± SEM. GraphPad Prism 8 software (version 8.0.2, GraphPad Software Inc., La Jolla, CA, USA) was used to test for differences via one-way analysis of variance (ANOVA) followed by the Tukey post hoc analysis and the unpaired t test. P < 0.01 was considered statistically significant.
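
For reference, the summary statistic reported here (mean ± SEM) is the sample mean together with the sample standard deviation divided by the square root of the number of observations. A minimal Python sketch, with a hypothetical function name:

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_and_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, standard error of the mean) for at least two values."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))  # sample SD / sqrt(n)
    return mean, sem
```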

Associations of the top signal with neurodegenerative diseases

The influences of the top SNP on cognition, brain Aβ deposition, FDG, and volume/thickness of brain regions of interest (hippocampus, parahippocampus, posterior cingulate, precuneus, cuneus, entorhinal cortex, and middle temporal region) were further explored using ADNI. Those who developed AD in the first 3 years were excluded. Linear regression models were used to determine the associations with the above indexes at baseline. Subgroup analysis based on diagnosis (HC versus MCI) was conducted. The R function “lm” was used to perform the above analyses. In addition, previous GWAS for AD (14), PD (15, 57), ALS (58), and FTD (16, 59–62) were searched for associations of the top signal with risk of neurodegenerative diseases.


Supplementary material for this article is available at

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We thank all the participants and their families, as well as many institutions and their staff that provided support for all studies involved in this collaboration. We also thank the participants, researchers, ADNI, and staff associated with many other studies from which we used data for this report. This work was made possible by the generous sharing of GWAS summary statistics. We thank the International Genomics of Alzheimer’s Project (IGAP) for providing summary results data for these analyses. The investigators within IGAP contributed to the design and implementation of IGAP and/or provided data but did not participate in analysis or writing of this report. IGAP was made possible by the generous participation of the control subjects, the patients, and their families. The in vitro and in vivo experiments were conducted in the center laboratory of Jing’an District Central Hospital. Funding: This study was supported by grants from the National Natural Science Foundation of China (91849126), the National Key R&D Program of China (2018YFC1314700), Shanghai Municipal Science and Technology Major Project (no.2018SHZDZX03) and ZHANGJIANG LAB, Tianqiao and Chrissy Chen Institute, and the State Key Laboratory of Neurobiology and Frontiers Center for Brain Science of Ministry of Education, Fudan University. The i-Select chips were funded by the French National Foundation on Alzheimer’s disease and related disorders. EADI was supported by the LABEX (laboratory of excellence program investment for the future) DISTALZ grant, Inserm, Institut Pasteur de Lille, Université de Lille 2, and the Lille University Hospital. GERAD was supported by the Medical Research Council (grant 503480), Alzheimer’s Research UK (grant 503176), the Wellcome Trust (grant 082604/2/07/Z), and German Federal Ministry of Education and Research (BMBF): Competence Network Dementia (CND) grants 01GI0102, 01GI0711, and 01GI0420. 
CHARGE was partly supported by the NIH/NIA grant R01 AG033193 and the NIA AG081220 and AGES contract N01-AG-12100, the NHLBI grant R01 HL105756, the Icelandic Heart Association, and the Erasmus Medical Center and Erasmus University. ADGC was supported by the NIH/NIA grants U01 AG032984, U24 AG021886, and U01 AG016976, and the Alzheimer’s Association grant ADGC-10-196728. Data collection and sharing for this project were funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (NIH grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through contributions from the following: AbbVie; Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd. and its affiliated company Genentech Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co. Inc.; Meso Scale Diagnostics LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the NIH. The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.
Author contributions: W.X. and J.-T.Y. analyzed the GWAS data, performed bioinformatics analyses, and wrote the manuscript. S.-D.H. and C.M. designed and performed cell-based studies. C.Z. helped design in vitro studies and draft the manuscript. W.X., J.-Q.L., C.-C.T., and H.-Q.L. performed the QC for the GWAS data. W.X., J.-Q.L., C.-C.T., Q.D., L.T., and J.-T.Y. provided CSF PGRN and genetic material for the replication analyses. W.X., J.-Q.L., and C.-C.T. measured and QCed the CSF PGRN. ADNI provided data. Q.D., L.T., and J.-T.Y. supervised and wrote the project. All authors read and approved the manuscript. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors. All ADNI data are available through the LONI Image and Data Archive (IDA), and interested scientists may apply for access on the ADNI website. IGAP database URL: PDGene database URL:
