In search of opportunity and community: Internal migration of refugees in the United States


Science Advances  07 Aug 2020:
Vol. 6, no. 32, eabb0295
DOI: 10.1126/sciadv.abb0295


At a time of heightened anxiety surrounding immigration, state governments have increasingly sought to manage immigrant and refugee flows. Yet the factors that influence where immigrants choose to settle after arrival remain unclear. We bring evidence to this question by analyzing population-level data for refugees resettled within the United States. Unlike other immigrants, refugees are assigned to initial locations across the country but are free to relocate and select another residence after arrival. Drawing on individual-level administrative data for adult refugees resettled between 2000 and 2014 (N = 447,747), we examine the relative desirability of locations by examining how retention rates and patterns of secondary migration differ across states. We find no discernible evidence that refugees’ locational choices are strongly influenced by state partisanship or the generosity of welfare benefits. Instead, we find that refugees prioritize locations with employment opportunities and existing co-national networks.


In recent years, policy-makers and politicians have engaged in public attempts to manage the internal migration of recent immigrants. For instance, Arizona and South Carolina have proactively restricted access to welfare benefits and social services (1, 2), while several governors and legislators have voiced opposition to accepting refugees and asylum seekers (3–5). In contrast, other states and cities have engaged in the explicit recruitment of immigrants and have established a range of services and incentives to position their locality as welcoming and supportive (6–8).

Whether seeking to encourage or discourage the arrival of immigrants, these political moves are motivated by assumptions about the factors that shape immigrants’ locational decisions. Yet this question remains the subject of considerable debate. Scholars have long recognized that established immigrant communities may be self-sustaining and attract future flows by providing prospective arrivals with information about local economic and living conditions. However, the determinants that lead immigrants to initially prioritize one location over another remain poorly understood. One influential perspective has argued that immigrants rationally optimize their residential location on the basis of expected income, including income from welfare benefits (9–11). Other accounts have minimized economic factors, instead highlighting the role played by local partisanship and immigrant organizations in fostering receptive communities (12, 13).

Regardless of local receptivity, immigrants face challenges obtaining accurate information regarding local policies, employment regulations, and benefits (14). Accordingly, research has increasingly focused on how social networks interact with economic conditions to shape migration patterns. For instance, a prominent strand of literature has highlighted the emergence of new immigrant destinations (15–17). Within these locations, industrial restructuring and policy shifts have led to changes in the relative demand for immigrant labor. As a critical mass of immigrant “pioneers” find economic success, they spur further migration via social networks (18–21). Consistent with this argument, evidence suggests that the majority of arrivals within new immigrant destinations reside within immigrant enclaves (22), while additional evidence demonstrates that the local density of co-nationals is an important determinant of subsequent retention.

Prior work has largely focused on the migration patterns of established immigrant groups, who tend to be visible within census data. Yet it remains unclear whether the findings apply to less common nationalities, as well as to immigrants who enter the United States via other pathways. Here, we add evidence to the literature on immigrants’ locational choices by analyzing population-level data on the secondary migration of refugees. For decades, the United States has operated the world’s largest refugee resettlement program, admitting more than 3 million individuals since inception. Unlike other immigrant groups, who may select their initial place of settlement, refugees are directly placed by resettlement agencies and thus exert limited control over their initial location (23). Yet similar to other immigrants, there are no legal barriers prohibiting refugees from moving elsewhere in the United States after arrival. Refugees who enter the United States thus face an immediate decision after arrival: Should they stay in the community selected for them or should they move? By examining the retention rates of various arrival locations across the United States, as well as subsequent migration patterns, we gain insight into the factors that shape refugees’ locational preferences while minimizing selection bias.

In addition to contributing to the literature on the factors that influence immigrants’ locational choices, our findings also provide actionable insights that can inform refugee resettlement policy. The 1980 Refugee Act instructs federal agencies to collect and analyze data on secondary migration, because communities that receive high flows may lack the federal funding, local service organizations, or language competencies necessary to cater to the needs of incoming refugees (24). Unlike other categories of immigrants, who are normally barred from accessing benefits for 5 years, resettled refugees have immediate access to services and benefits under the provisions of the Geneva Convention (25, 26). However, despite the importance of this policy issue (27, 28), congressional reports indicate that the government currently lacks the data necessary to analyze refugee migration patterns or target federal funds and services toward areas that receive secondary migrants (29, 30). The data held by the agency tasked with monitoring secondary migration, the Office of Refugee Resettlement (ORR), are highly aggregated and derived from incomplete state-by-state reporting of refugee enrollment in benefit programs (31–33). This introduces coverage bias because the data are limited to refugees who apply for specific state-level benefits (34). In an effort to overcome these administrative data limitations, ORR has fielded nonrepresentative surveys (35), while other researchers have used survey data to impute the refugee status of respondents (36). However, these approaches are subject to nonresponse bias and imputation error and do not permit a comprehensive assessment of refugee migration patterns.

In this study, we provide a comprehensive, individual-level analysis of the internal migration of refugees in the United States. To conduct the analysis, we draw on administrative data that contain background characteristics and locational choices for all refugees resettled in the United States between 2000 and 2014. We leverage the fact that U.S. immigration law requires that refugees apply for adjustment of status to become lawful permanent residents (LPRs) after they have lived in the United States for 12 months. These applications include information on each refugee’s state of residence, allowing us to track whether and where refugees have moved since arriving in the United States. Using unique identifiers, we linked refugee arrival records to data from the Computer Linked Application Information Management System (CLAIMS) and the Electronic Immigration System (ELIS) of the United States Citizenship and Immigration Services (USCIS). After excluding individuals who did not match LPR records or were below the age of 18 at arrival, the final sample covers state locations over time for 447,747 individuals.

The outcome variable is derived from refugee landing and LPR data and indicates whether a former refugee moved from the resettlement arrival state to another state by the time that USCIS received their application to adjust to LPR status. We focus on state-to-state migration to proxy consequential moves, as well as moves outside the initial service area. Details about the measures, sample, design, and statistical analysis can be found in Materials and Methods.


During the period under observation, 17% of refugees relocated from their initial state of arrival by the time they applied for LPR status. This level of state-to-state mobility is significantly higher than available estimates for the population of noncitizens in the United States with approximately 1 year of residency, of which only 3.4% report moving to a new state within the last year [2008–2012, American Community Survey (ACS) 5-year sample] (37).

Figure 1A demonstrates notable heterogeneity in retention rates across arrival states. More than 30% of refugees initially assigned to Louisiana, New Jersey, or Connecticut relocated, while less than 10% assigned to California and Nebraska left their arrival state. Destinations are regionally clustered: As seen in Fig. 1B, which maps flows between states, Midwestern states experienced the largest net gain in refugees following secondary migration, with Minnesota receiving the largest inflows.

Fig. 1 Secondary migration of refugees, 2000–2014.

(A) Proportion of refugees who moved out of their arrival state by the time they had applied for LPR status. (B) Total number of refugees moving in/out of each arrival state by the time they had applied for LPR status. Points above (below) the diagonal represent states receiving a net increase (decrease) in refugees due to secondary migration. (C) Refugee secondary migration flows between states, using the intersection of the top 8 states based on either the number of arrivals, number of refugees at time of adjustment, or number of net moves. Flows with fewer than 10 total movers are omitted. All panels focus on the refugees who arrived in those 39 states that received at least 1000 refugees over the study period. N = 443,546.
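The quantities summarized in panels (A) and (B) can be illustrated with a minimal sketch. The records below are invented for illustration, and the function names are hypothetical; this is not the authors' code.

```python
from collections import Counter

# Hypothetical (arrival_state, state_at_LPR_application) pairs, for illustration only.
records = [
    ("LA", "TX"), ("LA", "LA"), ("LA", "MN"),
    ("CA", "CA"), ("CA", "CA"), ("CA", "MN"),
    ("MN", "MN"), ("MN", "MN"),
]

def out_migration_share(records):
    """Share of each arrival state's refugees living elsewhere at LPR application."""
    arrivals = Counter(arr for arr, _ in records)
    movers = Counter(arr for arr, lpr in records if arr != lpr)
    return {state: movers[state] / n for state, n in arrivals.items()}

def net_change(records):
    """In-movers minus out-movers for every state involved in at least one move."""
    net = Counter()
    for arr, lpr in records:
        if arr != lpr:
            net[arr] -= 1  # the arrival state loses one refugee
            net[lpr] += 1  # the destination state gains one refugee
    return dict(net)

shares = out_migration_share(records)
net = net_change(records)
```

Because every move is counted once as an outflow and once as an inflow, the net changes sum to zero across states.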

We next examine how the probability of moving to another state varies by refugee characteristics. Figure 2 reports the marginal effect estimates from a linear probability model that regresses out-migration from the arrival state on refugee characteristics and state and arrival year fixed effects (see Materials and Methods for details). Individuals from Somalia and Ethiopia are most likely to move to a different state after arrival, while refugees from Bhutan and the Democratic Republic of the Congo are the least likely to relocate. The estimated difference in the probability of moving between Somalis and Congolese refugees is about 34 percentage points, indicating stark variation by national origin. Younger refugees and those without families are somewhat more likely to out-migrate, while there is little discernible variation across genders or levels of education. Refugees without existing ties to family members or friends in the United States are 10 percentage points more likely to leave their arrival state, relative to those with such ties. Given that the resettlement program mandates that the latter group are resettled in close proximity (ideally in the same city) to their U.S. tie, this gap is smaller than expected, reflecting high baseline rates of out-migration even among refugees with U.S. ties (12%).

Fig. 2 Probability of out-migration from arrival state, by refugee characteristics.

Coefficients from a linear probability model regressing whether a refugee moved out of her arrival state by the time she had applied for LPR status on refugee characteristics. Models include state, arrival year, and resettlement agency fixed effects and control for time to LPR application. Lines represent 95% robust confidence intervals. Unfilled circles represent reference categories for each attribute. N = 443,546 refugees from the 39 states that received at least 1000 refugees over the study period (2000–2014).

While the individual-level estimates provide insight into how patterns of secondary migration vary across refugee characteristics, the data also permit an analysis of how flows are related to changes in local push and pull factors. Accordingly, we aggregate annual moves, by nationality, between pairs of states and fit a gravity model with state pair fixed effects (38, 39). The gravity model allows us to estimate how variation in conditions in arrival and destination states in the year of the refugees’ arrival predict secondary migration flows within pairs of states (measured on a log scale) while controlling for all fixed characteristics of state pairs (including distance, location, size, etc.) and the annual stock of arrivals (see Materials and Methods for details).

Figure 3 demonstrates that state characteristics predict secondary migration flows. Contra expectations (9–11, 40, 41), we find that secondary migration flows are not associated with the political orientation of the state’s governor or the generosity of welfare expenditures. However, consistent with the literature on new immigrant destinations, we find a symmetric relationship with the share of co-nationals: a decrease in co-nationals in the arrival state or an increase in a potential destination state are each associated with increased refugee flows. Specifically, a change from the lowest to the highest quintile in the share of co-nationals in a potential destination state is associated with a 21% increase in secondary migration flows to the destination state. In contrast, a similar change in the share of co-nationals in the arrival state is associated with a 16% decrease in secondary migration from the arrival state. The results also suggest a weak symmetric relationship with labor market characteristics. High levels of unemployment in the arrival state are associated with out-migration, while higher unemployment in potential destination states is associated with lower levels of in-migration. For example, a change from the lowest to the highest quintile in the level of unemployment in a potential destination state is associated with a 9% decrease in secondary migration. A similar change in the arrival state’s level of unemployment is associated with a 6% increase in out-migration. These patterns are similar, albeit not statistically significant, for housing costs.

Fig. 3 Expected change in secondary migration, by state characteristics.

Estimates from a gravity model regressing the log total secondary migration flows on sending and destination state characteristics (coarsened into five equally sized bins). Individual-level flows are aggregated to the state-year-origin level. Models control for the initial stock of refugees in each arrival state-year-origin and include state pair and arrival year fixed effects. Lines represent 95% robust confidence intervals, two-way clustered on arrival and destination state. N = 197,343 state-to-state-by-origin flows and includes states that received at least 1000 refugees over the study period (2000–2014).
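When the outcome is measured on a log scale, a coefficient on a quintile dummy is often read as an approximate percentage change. The sketch below shows one common convention, 100 × (exp(β) − 1). Both the coefficient value and the assumption that this convention underlies the percentages reported here are illustrative; the source does not state how the percentages were computed. With a log(1 + flow) outcome, the reading is approximate.

```python
import math

def pct_change_from_log_coef(beta):
    """Approximate percent change in the outcome implied by a log-scale coefficient."""
    return 100.0 * (math.exp(beta) - 1.0)

# Hypothetical coefficient on a "highest quintile" dummy (not taken from the paper).
example_pct = pct_change_from_log_coef(0.19)
```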

Several checks support the robustness of the results. We find similar results when extending the out-migration analysis to include the same arrival state characteristics as in the gravity model. In particular, we find that the probability of out-migration is higher for refugees placed in states with fewer co-nationals and higher levels of unemployment (fig. S4). We also find similar results when we focus only on refugees without family ties whose location is assigned by placement officers from the resettlement agencies (fig. S5) or on refugees with family ties who are typically placed in locations near their family ties (fig. S6). Our findings hold when we focus on principal applicants only (fig. S7), include refugees from states that received fewer than 1000 refugees over the study period (fig. S8), or include only refugees who applied for LPR status within 24 instead of within 36 months (fig. S9). Moreover, we obtain similar results when replicating the gravity model using differences in the characteristics between the arrival and destination states as predictors (table S2), using alternative proxies for state welfare generosity (table S3) and alternative proxies for state partisanship (table S4), including states with fewer than 1000 refugees (fig. S10), including only refugees who applied for LPR status within 24 months (fig. S11), or omitting particular origin groups from the sample (table S5).


Together, these results demonstrate that while refugees are a mobile population, patterns of secondary migration are not haphazard and can be predicted by individual and contextual factors. Refugee secondary migration responds to the relative push and pull of local economic conditions and co-national networks in both the arrival state and potential destination states. The former likely reflects the efforts of refugees to find employment opportunities with minimal barriers to entry. The pull of co-national networks is consistent with evidence that social networks shape the internal migration patterns of immigrants, as well as with work that suggests that co-ethnic concentration may provide a softer landing in the form of support networks and employment opportunities (40, 42, 43).

These factors appear to outweigh a common concern among state governments, namely that immigrants will be attracted by welfare benefits. Although we find that refugees do appear to optimize on the basis of employment, we find no discernible evidence that they move to states with more generous welfare benefits. While this finding would be consistent with a policy environment in which it was costly to obtain information on comparative benefit levels, this constraint is unlikely to apply to the refugee population. U.S. refugee resettlement policy disperses refugees across states via nationally organized resettlement organizations that support refugees in accessing benefits. In this context, refugees could relatively easily acquire word-of-mouth information on comparative benefit levels from co-nationals resettled in other states.

The deliberate assignment of refugees to locations provides an opportunity to observe the retention rates of various arrival states, independent of initial self-selection. As a result, this analysis brings evidence to bear on the factors that shape refugees’ secondary migration patterns in the initial period after their arrival in the United States. Although refugees are a distinct group with a verified history of persecution, our findings suggest that their movement patterns share some similarities with those observed for immigrants admitted under nonhumanitarian programs, with co-national networks and local economic conditions playing an important role. Beyond providing population-level evidence on the factors that influence secondary migration, these findings also have direct implications for refugee resettlement policy. By leveraging the administrative data linkage that we outline in this study, policy-makers would gain the ability to target funds toward communities that receive high proportions of relocating refugees. While refugees can be expected to continue to move to areas where they see opportunity and community, systematically leveraging data on prior migration patterns would enable policy-makers to anticipate likely moves and select a set of initial destinations that maximize the likelihood of successful adaptation to life within the United States.


Our dataset combines two administrative datasets held by the Office of Immigration Statistics. The first dataset, from the Worldwide Refugee Admissions Processing System of the Bureau of Population, Refugees, and Migration of the U.S. Department of State, includes all refugees resettled in the United States from 2000 to 2014. These data contain refugees’ selected sociodemographic characteristics as measured before arrival, including nationality, gender, age, education, family size, the relationship to the principal applicant, and whether the family had existing ties to individuals within the United States. The data also include the date of arrival, the refugee resettlement agency handling the case, and the initial resettlement location.

Refugee arrival records were linked to data from USCIS CLAIMS and USCIS ELIS, which maintains information from applications for LPR status. Refugees are required by statute to apply for LPR status 1 year after admission to the United States. The LPR data contain the date of receipt of the LPR application and the applicant’s location at the time of the submission.

Using the Alien Registration Numbers (A-Numbers), unique identifiers assigned by USCIS, we merged the LPR and refugee arrival datasets. A total of 96% of refugees were successfully merged to the LPR dataset. The remaining refugees who did not match to the LPR dataset might have left the United States, might be deceased, or there might have been inconsistencies in the A-Number records such that they cannot be merged. Figures S1 to S3 show that the match rate is stable across arrival state, origin, and arrival years. We exclude refugees without a matching LPR record from the data since we cannot observe their subsequent locations. We also remove Cubans who arrived in the United States under the protection of specific programs, as well as all refugees below the age of 18 at arrival, under the assumption that these refugees are unlikely to make independent locational choices.
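The linkage and exclusion steps described above can be sketched as follows. The field names (`a_number`, `arrival_state`, `lpr_state`, `age_at_arrival`) and the records are hypothetical and do not reflect the actual schema; the exclusion of specific Cuban arrivals is omitted for brevity.

```python
# Hypothetical records; field names and values are illustrative, not the real schema.
arrivals = [
    {"a_number": "A001", "arrival_state": "TX", "age_at_arrival": 34},
    {"a_number": "A002", "arrival_state": "OH", "age_at_arrival": 12},
    {"a_number": "A003", "arrival_state": "GA", "age_at_arrival": 51},
    {"a_number": "A004", "arrival_state": "NY", "age_at_arrival": 27},
]
lpr_applications = [
    {"a_number": "A001", "lpr_state": "MN"},
    {"a_number": "A002", "lpr_state": "OH"},
    {"a_number": "A004", "lpr_state": "NY"},
]

def link_records(arrivals, lpr_applications, min_age=18):
    """Merge arrival and LPR records on the identifier, then drop minors."""
    lpr_by_id = {rec["a_number"]: rec for rec in lpr_applications}
    # Keep only arrivals with a matching LPR record (an inner join).
    matched = [
        {**arr, **lpr_by_id[arr["a_number"]]}
        for arr in arrivals
        if arr["a_number"] in lpr_by_id
    ]
    match_rate = len(matched) / len(arrivals)
    # Exclude refugees younger than min_age at arrival.
    sample = [rec for rec in matched if rec["age_at_arrival"] >= min_age]
    return sample, match_rate

sample, match_rate = link_records(arrivals, lpr_applications)
```

Computing the match rate before applying the age filter keeps it comparable to the share of all arrivals that could be linked.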

We measure secondary migration by determining whether a refugee has moved from his/her arrival state to another state by the time that he/she applies for adjustment to LPR status. Refugees are required by statute to apply for LPR status 12 months after admission to the United States. The majority of refugees apply for LPR status shortly after they have been in the United States for 12 months, but some take longer to submit their application. To increase comparability, we restrict the sample to refugees who submit their LPR application within 36 months of arrival. This constitutes about 96% of the refugees with a matched LPR record. The refugees who do not adjust to LPR status within this time frame are concentrated among the more recent arrival cohorts. The median refugee submitted their adjustment application 14 months after arrival with an interquartile range of 12 to 18 months, so most refugees apply right after they become eligible to do so. In Supplementary Text, we also run the analysis including only refugees who applied within 24 months; we find similar results when using this shorter time frame (figs. S9 and S11). The rate of secondary migration was roughly stable across years of arrival (fig. S12). We are unable to systematically observe longer-term secondary migration patterns among refugees (e.g., after naturalization) due to administrative data limitations.
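A minimal sketch of how the outcome and the 36-month restriction might be coded is given below. Counting elapsed time in whole calendar months, and treating exactly 36 whole months as inside the window, are assumptions made for illustration; the function names are hypothetical.

```python
from datetime import date

def months_between(start, end):
    """Whole calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return months

def code_outcome(arrival_state, lpr_state, arrival_date, lpr_receipt_date,
                 max_months=36):
    """Return (moved, months), or None if the application falls outside the window."""
    months = months_between(arrival_date, lpr_receipt_date)
    if months > max_months:
        return None
    moved = 1 if arrival_state != lpr_state else 0
    return moved, months

in_window = code_outcome("TX", "MN", date(2010, 3, 15), date(2011, 5, 1))
out_of_window = code_outcome("TX", "TX", date(2010, 3, 15), date(2014, 3, 15))
```

The months value returned alongside the indicator corresponds to the time-to-application control used in the regression models.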

The final sample size is 447,747 refugees. In models that include state-level factors, such as the share of co-nationals, we focus on refugees from the top 15 origin countries. For the main analysis, we also focus on refugees who arrived in states that received at least 1000 refugees over the entire 2000–2014 period. We impose this restriction to ensure that the results are not driven by small states that received very few refugees. However, we also replicate the main analysis including all states and find similar results.

To examine the influence of geographic factors, we merged the dataset with state-level data. We obtained unemployment statistics from the Department of Labor’s Local Area Unemployment Statistics. Information on gross domestic product (GDP) and the cost of housing was gathered from the Bureau of Economic Analysis. Data on state welfare spending were drawn from the U.S. Census Bureau’s Annual Survey of State Government Finances and scaled by the Census Bureau’s estimates of the total population below the poverty line. Estimates on co-national shares are drawn from U.S. Census Bureau data. Microdata samples were provided and harmonized by the Integrated Public Use Microdata Series (IPUMS). Estimates between 2009 and 2015 are derived from the 5-year ACS, while estimates from 2000 are from the 2000 U.S. Census (5% sample). Between 2001 and 2008, we interpolate between the 2000 Census and the 2009 5-year ACS to construct yearly estimates. All geographic factors are measured for the year of arrival for a given refugee.
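The construction of yearly co-national shares between the two census benchmarks can be sketched as below. The source does not state the interpolation method; linear interpolation is assumed here, and the share values are invented.

```python
def interpolate_share(year, share_2000, share_2009, start=2000, end=2009):
    """Linearly interpolate a state's co-national share between two benchmark years."""
    if not start <= year <= end:
        raise ValueError("year must lie between the two benchmark years")
    weight = (year - start) / (end - start)
    return share_2000 + weight * (share_2009 - share_2000)

# Hypothetical shares for one state-origin pair: 1.0% in 2000 and 1.9% in 2009.
yearly = {y: interpolate_share(y, 0.010, 0.019) for y in range(2000, 2010)}
```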

Push model (Fig. 2)

Figure 2 reports the results of a linear probability model where we regress an indicator for whether an individual moved out of her arrival state by the time that USCIS had received her LPR application (1 if yes, 0 if no) on individual-level predictors and fixed effects for state of arrival, year of arrival, and resettlement agency, with SEs clustered by the case indicator that links refugee families. Individual-level predictors include age, gender, education, family size, relationship to principal applicant, nationality, and family member or friend already in the United States (U.S. tie).
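To illustrate how a linear probability model is read, the sketch below fits ordinary least squares with a single binary predictor. In that special case the slope equals the difference in out-migration rates between the two groups. The data are invented, and the fixed effects, additional predictors, and clustered SEs of the actual specification are omitted.

```python
def ols_slope_intercept(x, y):
    """Closed-form OLS for one predictor: slope = cov(x, y) / var(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: us_tie = 1 if the refugee has a U.S. tie,
# moved = 1 if the refugee left the arrival state before the LPR application.
us_tie = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
moved = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

slope, intercept = ols_slope_intercept(us_tie, moved)
```

Here the intercept is the out-migration rate among refugees without a U.S. tie, and the slope is the change in that rate, in probability units, associated with having a tie.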

To enhance interpretability and avoid strong functional form assumptions, we discretize the continuous predictors as follows: Age at arrival is coded into six bins (18 to 20, 21 to 30, 31 to 40, 41 to 50, 51 to 60, and 60+ years); education is coded into five bins (no schooling/unknown, primary, secondary, postsecondary, and university); case size is coded into four bins (1, 2, 3, and 4+ persons in the family), and for the nationalities, we code dummies for each of the 15 largest refugee nationalities. To ensure that the timing of the move is measured at a similar point in time after arrival, the model also controls for the number of months from arrival to the receipt of the LPR application. This latter variable is also discretized into quintiles.
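The binning of age and case size can be sketched as follows. The bin labels are illustrative, and placing ages above 60 in the "60+" bin (so that age 60 falls in "51-60") is an assumption about how the overlapping boundary is resolved.

```python
def age_bin(age):
    """Map age at arrival to one of the six bins described in the text."""
    if age < 18:
        raise ValueError("sample is restricted to adults")
    for upper, label in [(20, "18-20"), (30, "21-30"), (40, "31-40"),
                         (50, "41-50"), (60, "51-60")]:
        if age <= upper:
            return label
    return "60+"

def case_size_bin(n_persons):
    """Map family (case) size to 1, 2, 3, or 4+."""
    if n_persons < 1:
        raise ValueError("case size must be at least 1")
    return "4+" if n_persons >= 4 else str(n_persons)
```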

As a robustness check, in a separate specification, we add arrival state characteristics as predictors. The predictors include the share of co-nationals, cost of housing, unemployment, welfare expenditure per poor individual, GDP per capita growth, and whether the governor is a Democrat. We discretize the variables that measure the share of co-nationals, cost of housing, unemployment, welfare expenditure per poor individual, and GDP per capita growth into quintiles, and we add one dummy variable for each quintile (using the first quintile as the reference category).
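One way to build quintile dummies with the first quintile as the reference category is sketched below. The rank-based assignment, which splits ties arbitrarily, and the unemployment values are assumptions made for illustration.

```python
def quintile_labels(values):
    """Assign labels 1..5 by rank so that the bins are as equal in size as possible."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * 5 // len(values) + 1
    return labels

def quintile_dummies(label):
    """Indicators for quintiles 2 to 5; quintile 1 is the omitted reference."""
    return [1 if label == q else 0 for q in (2, 3, 4, 5)]

# Hypothetical state unemployment rates (percent).
unemployment = [3.1, 9.8, 5.0, 7.2, 4.4, 6.5, 8.9, 2.7, 10.4, 5.9]
labels = quintile_labels(unemployment)
```

An observation in the lowest quintile receives all zeros, so each estimated coefficient is a contrast against that group.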

Gravity model (Fig. 3)

Figure 3 reports the results of a gravity model with state dyad fixed effects. Individual-level flows are aggregated to the state-year-origin level (i.e., the total number of Burmese arriving in state i moving to state z in year t). We split arrivals by origins to permit an assessment of the role of co-national shares. The dependent variable is the log of one plus the total number of movers, regressed on separate predictors for both the sending and destination state. The model includes dyad fixed effects to control for unchanging characteristics of state pairs as well as year fixed effects and also controls for the stock of arrivals in each state-year-origin. SEs are clustered by arrival and destination state using two-way clustering. The state characteristics included in the model are the co-national share, cost of housing, unemployment, welfare expenditure per poor individual, GDP per capita growth, whether the governor is a Democrat, and log total arrivals in each state-year-origin. The model includes separate predictors for arrival state and destination state in each dyad to allow for asymmetric effects (i.e., we do not impose an assumption that a change has the same effect for arrival and destination state). We discretize the continuous predictors into quintiles, and we add one dummy variable for each quintile (using the first quintile as the reference category). Since this model includes state-dyad fixed effects, it controls for all fixed unobserved characteristics of the state pair that might affect moving rates, such as distance, geographic location, etc. The effects of the predictors are identified on the basis of over-time changes within the same state dyad. The year fixed effects control for common shocks.
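The data preparation behind such a model can be sketched in three steps: aggregate individual moves into dyad-year-origin cells, take the log of one plus the count, and subtract each dyad's mean (the within transformation that dyad fixed effects imply). The moves below are invented, and the sketch omits zero-flow cells, year fixed effects, covariates, and the regression itself.

```python
import math
from collections import defaultdict

# Hypothetical individual moves: (arrival_state, destination_state, year, origin).
moves = [
    ("TX", "MN", 2005, "Somalia"),
    ("TX", "MN", 2005, "Somalia"),
    ("TX", "MN", 2005, "Somalia"),
    ("TX", "MN", 2006, "Somalia"),
    ("GA", "OH", 2005, "Bhutan"),
]

def aggregate_flows(moves):
    """Count movers per (arrival, destination, year, origin) cell."""
    flows = defaultdict(int)
    for cell in moves:
        flows[cell] += 1
    return dict(flows)

def log_flows(flows):
    """Dependent variable: log(1 + number of movers) in each cell."""
    return {cell: math.log1p(n) for cell, n in flows.items()}

def demean_within_dyad(values):
    """Subtract each state pair's mean so only over-time variation remains."""
    by_dyad = defaultdict(list)
    for (arr, dest, _year, _origin), v in values.items():
        by_dyad[(arr, dest)].append(v)
    means = {dyad: sum(vs) / len(vs) for dyad, vs in by_dyad.items()}
    return {cell: v - means[(cell[0], cell[1])] for cell, v in values.items()}

flows = aggregate_flows(moves)
y = log_flows(flows)
y_within = demean_within_dyad(y)
```

A dyad observed in only one cell has a demeaned value of zero and therefore contributes nothing to the estimates, which is why identification rests on changes over time within the same state pair.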

Given that our measures of welfare generosity and receptivity are proxies, we also replicate the gravity model using a number of alternative measures for each to ensure that the results are not a function of our choice of a specific proxy. In addition, we include specifications where the state characteristics are entered in differenced form as the gap between arrival and destination state within a dyad, which imposes a symmetry assumption on the coefficients.


Supplementary material for this article is available at

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We thank members of the Immigration Policy Lab at Stanford University and N. Zelic for helpful feedback. Funding: We acknowledge that we received no funding in support of this research. Author contributions: N.M., J.F., D.L., J.W., and J.H. designed research; N.M., J.F., D.L., J.W., and J.H. performed research; N.M. analyzed data; and N.M., J.F., D.L., J.W., and J.H. wrote the paper. Competing interests: The authors declare that they have no competing interests. This paper was written by N.M. in her personal capacity. Any opinions and conclusions expressed herein are those of the authors and do not reflect the official policy or position of U.S. Citizenship and Immigration Services, the Department of Homeland Security, or the U.S. government. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional individual-level data are controlled by the Department of Homeland Security. Replication code files for the statistical analyses will be posted upon publication online at