Research Article: SOCIOLOGY

The universal pathway to innovative urban economies


Science Advances  21 Aug 2020:
Vol. 6, no. 34, eaba4934
DOI: 10.1126/sciadv.aba4934


Is there a universal economic pathway individual cities recapitulate over and over? This evolutionary structure—if any—would inform a reference model for fairer assessment, better maintenance, and improved forecasting of urban development. Using employment data including more than 100 million U.S. workers in all industries between 1998 and 2013, we empirically show that individual cities indeed recapitulate a common pathway where a transition to innovative economies is observed at the population of 1.2 million. This critical population is analytically derived by expressing the urban industrial structure as a function of scaling relations such that cities are divided into two economic categories: small city economies with sublinear industries and large city economies with superlinear industries. Last, we define a recapitulation score as an agreement between the longitudinal and the cross-sectional scaling exponents and find that nontradeable industries tend to adhere to the universal pathway more than the tradeable.


Home to more than half of humanity, cities have pushed the boundaries of human productivity, innovation, and success as economic, social, and cultural hubs in an era of rapid urbanization (1, 2). The future development of cities may depend on identifying and maintaining the pathways of urban evolution. To this end, many features of human behavior and society have been modeled by frameworks of universality (3–5), which suggests that cities too may follow universal trends regardless of regional and historical differences. Identifying these universal trends in urban economic growth may prepare policy makers for the economic and industrial needs of their growing cities.

The development of individual cities has long been considered as idiosyncratic and specialized according to their historical and geographical constraints (6–13); however, recent studies reveal several regular patterns associating urban characteristics with population size (1, 14–21). In particular, many different quantities $Y$ change with city population $N$ according to $Y(N) \approx Y_0 N^{\beta}$ (14, 17, 21, 22), suggesting that population size can indicate a substantial number of urban properties under a universal mechanism applied to all cities (14), although precise estimates of β can be difficult (23, 24). This scaling pattern also describes economic properties and further characterizes individual industries by their labor types. For example, cognitive labor–based industries scale superlinearly with population size (β > 1), while manual labor–based industries instead exhibit sublinear scaling (β < 1) (21, 25). These population dependencies may undergird many aspects of the urban economic structure: small cities heavily rely on manual labor while large cities rely on cognitive labor (21, 26–29).

While the cross-sectional scaling relation describes urban economic features as a function of population size (1), it remains to be seen how useful it is when it comes to a city’s longitudinal evolution (30–32). To test this, we measure to what extent individual cities recapitulate the pathway universally prescribed by the scaling law. Inspired by biological ontogenetic growth and recapitulation, we call this property “urban recapitulation” (33). Using employment data for U.S. cities from 1998 to 2013, we show empirically how the industrial character of cities changes with urban size and industries’ scaling exponent β.

We demonstrate that the longitudinal change of urban economies indeed follows a universal process governed by changes in population size. By examining the revealed comparative advantage for individual industries given each city, our analysis reveals a transition point from a manual labor economy to a cognitive-based innovative economy at around a population of 1.2 million in the United States. We provide a recapitulation score to measure the deviation of a city’s economy from the cross-sectional scaling by which the universal pathway is revealed.


Explaining innovative economies of large cities

We compare industrial employment and population changes in 350 U.S. cities according to the industries that characterize each city between 1998 and 2013. We measure the importance of industry $i$ to city $c$ according to its revealed comparative advantage (RCA) (34–37) (i.e., location quotient), given by $\mathrm{rca}_{ci} = (Y_{ci}/\sum_i Y_{ci})/(\sum_c Y_{ci}/\sum_{c,i} Y_{ci})$, where $Y_{ci}$ denotes the employment of $i$ in $c$. Normalizing employment statistics in this way controls for ubiquitous industries and, thereby, highlights the industries that distinguish urban economies from each other. In particular, an industry is called characteristic if $\mathrm{rca}_{ci} > 1$ (34, 35).
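The RCA definition above can be illustrated with a minimal sketch. The function name `rca_matrix` and the toy employment numbers are hypothetical and are not taken from the study's data; the computation simply follows the ratio of shares stated in the text.

```python
def rca_matrix(employment):
    """Return rca[c][i] from employment[c][i] (rows: cities, columns: industries)."""
    n_cities = len(employment)
    n_industries = len(employment[0])
    city_totals = [sum(row) for row in employment]            # sum_i Y_ci
    industry_totals = [
        sum(employment[c][i] for c in range(n_cities))        # sum_c Y_ci
        for i in range(n_industries)
    ]
    grand_total = sum(city_totals)                            # sum_{c,i} Y_ci
    return [
        [
            (employment[c][i] / city_totals[c]) / (industry_totals[i] / grand_total)
            for i in range(n_industries)
        ]
        for c in range(n_cities)
    ]


# Hypothetical toy data: two cities, two industries.
employment = [
    [80, 20],    # small city, mostly industry 0
    [100, 300],  # large city, mostly industry 1
]
rca = rca_matrix(employment)
# An industry is "characteristic" of a city when its RCA exceeds 1.
characteristic = [[value > 1 for value in row] for row in rca]
```

In this toy case each city's dominant industry is the one flagged as characteristic, because its local employment share exceeds its share across all cities.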

Figure 1A shows characteristic industries in cities of different sizes. Consistent with existing studies (21, 26, 27, 38), small cities are characterized by manual industries such as agriculture and mining, while large cities are characterized by cognitive industries, such as management and professional services. The urban scaling model $Y_{ci} \approx Y_{io} N_c^{\beta_i}$ across cities of population size $N_c$ accurately describes employment for most industries (i.e., $R^2 > 0.65$), showing superlinear scaling (i.e., $\beta_i > 1$) for cognitive industries and sublinear scaling (i.e., $\beta_i < 1$) for manual labor industries (see Table 1 and the Supplementary Materials). Small cities tend to be characterized by sublinearly scaled industries, while large cities are characterized by superlinearly scaled industries (see histograms in Fig. 1, A and B). We observe a strong correlation between the average scaling exponent of characteristic industries and urban size (Pearson correlation ρ = 0.59).
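One common way to estimate a scaling exponent is an ordinary least-squares fit in log-log space. The sketch below assumes that approach; the text does not state the estimation procedure used in the study, and it notes elsewhere that precise estimates of β can be difficult. The function name `fit_scaling` and the synthetic data are hypothetical.

```python
import math


def fit_scaling(populations, employment):
    """OLS fit of log Y = log Y0 + beta * log N; returns (beta, log_y0, r_squared)."""
    x = [math.log(n) for n in populations]
    y = [math.log(v) for v in employment]
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxx = sum((a - x_mean) ** 2 for a in x)
    syy = sum((b - y_mean) ** 2 for b in y)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    beta = sxy / sxx
    log_y0 = y_mean - beta * x_mean
    r_squared = sxy ** 2 / (sxx * syy)
    return beta, log_y0, r_squared


# Synthetic data: an exact power law with a superlinear exponent of 1.2.
populations = [1e5, 5e5, 1e6, 5e6]
employment = [0.01 * n ** 1.2 for n in populations]
beta, log_y0, r_squared = fit_scaling(populations, employment)
```

On noise-free synthetic data the fit recovers the exponent used to generate it; real employment data would scatter around the fitted line, which is what the $R^2$ threshold in the text screens for.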

Fig. 1 City size determines characteristic economic structure.

(A) The characteristic industries of the three smallest, medium, and largest cities in the industry space (see Materials and Methods). Each industry (node) is sized by its comparative advantages and colored by the cross-sectional scaling exponent (β). Every value is averaged over the 16-year time span reflected in our dataset. Empty nodes are noncharacteristic industries. Two industries are connected when they are likely to exist in the same city (ψ > 0.15). The histogram shows the frequency of scaling exponents of characteristic industries in each industry space. (B) The average scaling exponent of characteristic industries in each U.S. city (y axis) compared to population (x axis). (C) We compare the importance of industries (y axis) with different scaling exponents (color) across cities of different sizes (x axis). A vertical section denotes the economic profile of a city of that population. We observe a critical population (N* ≈ 1.2 million) that divides small city economies from innovative large city economies. (D) After ordering U.S. cities in decreasing size (x and y axes), we measure the pairwise Pearson correlation of economic profiles Ic of cities averaged for the entire time span. As in (C), N* corresponds to a critical population size that separates cities based on economic profile.

Table 1 Urban recapitulation is common across industries.

List of recapitulation scores $S_i$, cross-sectional scaling exponents $\beta_i$, scaled growth coefficients $\hat{\beta}_i$, and nationwide trends $\Delta\log \hat{Y}_{io}$ of industry sectors in the order of $S_i$. The cross-sectional scaling exponent is averaged over the 16-year time span. The scaled growth coefficient and the nationwide trend are measured for the difference between 1998 and 2013 according to Eq. 2. The recapitulation score captures how much the scaled growth is associated with the cross-sectional scaling according to Eq. 3.


Why should this empirical relationship exist? We explore this question with an analytical model. The importance of an industry in a city, $\mathrm{rca}_{ci}$, can be expressed as a function of the city’s population using the industry’s scaling exponent $\beta_i$. Assuming urban population is distributed according to $P(N) \propto N^{-2}$ following Zipf’s law (39) (see the Supplementary Materials for further justification), we have

$$\mathrm{rca}_{ci} \propto \frac{(\beta_i - 1) \cdot N_c^{\beta_i - 1}}{N_{\max}^{\beta_i - 1} - N_{\min}^{\beta_i - 1}} \tag{1}$$

where $N_{\max}$ and $N_{\min}$ are the sizes of the largest and smallest cities, respectively (see the Supplementary Materials for a complete derivation). Following this model, the $\mathrm{rca}_{ci}$ of a given industry increases monotonically with population for superlinear industries and decreases for sublinear industries, consistent with the observations in Fig. 1A.
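The form of Eq. 1 can be motivated by the following sketch. It is not the complete derivation, which is in the Supplementary Materials. It additionally assumes that total employment in a city is proportional to its population, and it replaces the sum over cities with an integral over the Zipf distribution; both steps are assumptions made here for illustration.

```latex
\begin{align*}
\mathrm{rca}_{ci}
  &= \frac{Y_{ci} / \sum_i Y_{ci}}{\sum_c Y_{ci} / \sum_{c,i} Y_{ci}},
  \qquad Y_{ci} \approx Y_{io} N_c^{\beta_i} \\
\sum_i Y_{ci}
  &\propto N_c
  \quad \text{(assumed: total employment proportional to population)} \\
\sum_c Y_{ci}
  &\propto Y_{io} \int_{N_{\min}}^{N_{\max}} N^{\beta_i} \, N^{-2} \, dN
   = Y_{io} \, \frac{N_{\max}^{\beta_i - 1} - N_{\min}^{\beta_i - 1}}{\beta_i - 1} \\
\mathrm{rca}_{ci}
  &\propto \frac{Y_{io} N_c^{\beta_i} / N_c}
               {Y_{io} \left( N_{\max}^{\beta_i - 1} - N_{\min}^{\beta_i - 1} \right) / (\beta_i - 1)}
   = \frac{(\beta_i - 1) \cdot N_c^{\beta_i - 1}}
          {N_{\max}^{\beta_i - 1} - N_{\min}^{\beta_i - 1}}
\end{align*}
```

The grand total $\sum_{c,i} Y_{ci}$ is a constant and is absorbed into the proportionality. For $\beta_i < 1$ both the prefactor $(\beta_i - 1)$ and the denominator are negative, so the ratio stays positive while $N_c^{\beta_i - 1}$ decreases with population.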

This analysis highlights a transition point that separates urban economies into small city manual labor economies and large city cognitive labor economies. Solving for fixed points in Eq. 1 reveals a saddle point at β* = 1 and N* ≈ 1.2 million, which is around the population of Louisville, KY, the 43rd largest city (see the Supplementary Materials). Empirical observations support this analytical result; Fig. 1C demonstrates an inversion of characteristic industries around N* = 1.2 million by their scaling exponents. Furthermore, the pairwise correlation of economic profiles in Fig. 1D confirms this division into two city clusters at around the 50th largest city (see Materials and Methods for the correlation). Cities below this critical population have “small city” economies represented by manual labor industries with sublinear scaling, while cities above the critical size have “large city” economies represented by cognitive labor industries with superlinear scaling.
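The monotonic behavior of Eq. 1 on either side of β = 1 can be checked numerically. The sketch below evaluates the right-hand side of Eq. 1 up to its proportionality constant. The function name `rca_model` and the bounds `n_min` and `n_max` are hypothetical placeholders rather than values from the study, and the sketch does not attempt to reproduce the critical population N*, whose derivation is in the Supplementary Materials.

```python
def rca_model(population, beta, n_min, n_max):
    """Right-hand side of Eq. 1 up to a constant factor (undefined at beta == 1)."""
    exponent = beta - 1.0
    return exponent * population ** exponent / (n_max ** exponent - n_min ** exponent)


# Hypothetical smallest and largest city sizes, for illustration only.
n_min, n_max = 5e4, 2e7
populations = [1e5, 1e6, 1e7]

superlinear = [rca_model(n, 1.2, n_min, n_max) for n in populations]  # beta > 1
sublinear = [rca_model(n, 0.8, n_min, n_max) for n in populations]    # beta < 1
```

With these inputs the superlinear values rise with population and the sublinear values fall, which matches the qualitative statement that accompanies Eq. 1.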

Universal pathway recapitulated by individual industries

Thus far, we have observed cross-sectional evidence relating population size and the economic structure of cities, but does this trend also describe the economic evolution of cities over time? In particular, do longitudinal changes in population relate to longitudinal changes in industrial composition governed by industries’ scaling exponents? If so, then the combined cross-sectional and temporal observations would be evidence for a universal pathway recapitulated by urban economies as cities grow.

We illustrate the trajectory of employment versus population for individual cities to describe their longitudinal dynamics over a 16-year time span. In aggregate, we find strong evidence to suggest that changes in city size are more influential on employment by industry than national labor trends (see table S3), but there is some variation by industry. For some industries, such as education (Fig. 2A), cities are likely to move along the universal pathway given by the scaling relation, whereas other industries, such as manufacturing (Fig. 2B), drift upward or downward with a strong nationwide trend [e.g., expansion of health care (40, 41); the Supplementary Materials contain a full set of industries].

Fig. 2 Cities recapitulate the industrial employment of larger cities.

(A) The trajectory of each city’s Educational Services employment and population size. The scaling relations are denoted by the lines for 1998 (dashed) and 2013 (solid). Arrows depict the change in population and industry size of each city from 1998 to 2013. The detrended trajectory of each city is depicted in the inset. We decompose the nationwide trend and subtract it from the employment growth of each city. (B) Similar to (A), the trajectory of each city by employment in manufacturing. (C) Decomposition of a city’s trajectory. A city’s trajectory (black arrow) can be decomposed into scaled growth (red arrow) and nationwide trend (blue arrow). (D) The recapitulation score for each industry in the 2-digit North American Industry Classification System (NAICS) classification. For industries that are well described by urban scaling (i.e., $R^2 > 0.65$), the average industrial recapitulation score of 0.70 is reasonably high.

We first separate scaled growth effects in cities from nationwide trends to properly estimate longitudinal scaling effects. From the scaling equation $Y_{ci} \approx Y_{io} N_c^{\beta_i}$, changes of industry $i$ employment in city $c$ are decomposed into two parts

$$\Delta\log Y_{ci}(t) \approx \Delta\log Y_{io}(t) + \beta_i \, \Delta\log N_c(t) \tag{2}$$

where $\Delta\log Y_{ci}$ is the total longitudinal change in employment and $\beta_i \Delta\log N_c$ is the change associated with changes in population size (namely, scaled growth) between the starting year, 1998, and the ending year, 2013, of our dataset. The regression on the observations $\Delta\log Y_{ci}$ and $\Delta\log N_c$ measures the empirical scaled growth coefficient $\hat{\beta}_i$ and the nationwide trend $\Delta\log \hat{Y}_{io}$ (see the schematic in Fig. 2C). This scaled growth coefficient denotes the longitudinal scaling effect on the employment change with respect to the population change.
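The decomposition in Eq. 2 can be sketched as a simple linear regression across cities for one industry: the slope plays the role of the scaled growth coefficient and the intercept plays the role of the nationwide trend. The sketch assumes ordinary least squares and natural logarithms; the function name `scaled_growth` and all numbers are hypothetical.

```python
import math


def scaled_growth(pop_start, pop_end, emp_start, emp_end):
    """Regress dlog(Y) on dlog(N) across cities; returns (beta_hat, national_trend)."""
    dlog_n = [math.log(end / start) for start, end in zip(pop_start, pop_end)]
    dlog_y = [math.log(end / start) for start, end in zip(emp_start, emp_end)]
    n_mean = sum(dlog_n) / len(dlog_n)
    y_mean = sum(dlog_y) / len(dlog_y)
    sxx = sum((x - n_mean) ** 2 for x in dlog_n)
    sxy = sum((x - n_mean) * (y - y_mean) for x, y in zip(dlog_n, dlog_y))
    beta_hat = sxy / sxx                          # slope: scaled growth coefficient
    national_trend = y_mean - beta_hat * n_mean   # intercept: nationwide trend
    return beta_hat, national_trend


# Synthetic cities whose employment follows Eq. 2 exactly,
# with a true exponent of 0.9 and a nationwide trend of 0.05.
pop_1998 = [1e5, 2e5, 1e6]
pop_2013 = [1.1e5, 2.6e5, 1.5e6]
emp_1998 = [1000.0, 2500.0, 9000.0]
emp_2013 = [
    e * math.exp(0.05) * (n1 / n0) ** 0.9
    for e, n0, n1 in zip(emp_1998, pop_1998, pop_2013)
]
beta_hat, national_trend = scaled_growth(pop_1998, pop_2013, emp_1998, emp_2013)
```

Subtracting `national_trend` from each city's employment change gives the detrended trajectory of the kind shown in the insets of Fig. 2, A and B.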

Detrending the nationwide effect reveals universal dynamics of cities along a common pathway, namely, urban recapitulation (see the insets in Fig. 2, A and B). The trajectories of individual cities converge to the common pathway, analogous to other industries with smaller nationwide trends (e.g., education). This convergence shows that longitudinal changes in industrial employment strongly depend on changes in population following the urban scaling relationship between employment and population at a given moment. Urban scaling provides a reference framework for estimating the upcoming urban economic change when population change is expected.

We provide a recapitulation score that quantitatively measures how accurately a cross-sectional scaling exponent estimates scaled growth as follows:

$$S_i = 1 - \frac{|\hat{\beta}_i - \beta_i|}{\beta_i} \tag{3}$$

where $\hat{\beta}_i$ is the scaled growth coefficient, and $\beta_i$ is the cross-sectional scaling exponent. If population changes are perfectly related to employment changes, then we expect the scaled growth coefficient to be equal to the scaling exponent with $S_i = 1$. On the other hand, a recapitulation score equal to zero suggests that population change does not affect employment. Across industries, recapitulation scores tend to be high with an average recapitulation score of $\bar{S}_i \approx 0.7$, which suggests that urban recapitulation is common across different U.S. industries. Furthermore, this observation provides a sufficient condition for a constant deviation of scaling relations at different moments (1).
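The two limiting cases described for the recapitulation score can be checked with a short sketch. It assumes Eq. 3 takes the form of one minus the relative absolute deviation of the scaled growth coefficient from the cross-sectional exponent, which is consistent with the stated limits; the function name and example values are hypothetical.

```python
def recapitulation_score(beta_hat, beta):
    """One minus the relative absolute deviation of beta_hat from beta (assumed form of Eq. 3)."""
    return 1.0 - abs(beta_hat - beta) / beta


perfect = recapitulation_score(1.1, 1.1)    # scaled growth equals the exponent
no_effect = recapitulation_score(0.0, 0.9)  # population change has no effect
partial = recapitulation_score(0.8, 1.0)    # scaled growth falls 20% short
```

Under this form, a scaled growth coefficient that overshoots the exponent is penalized as much as one that undershoots it by the same amount.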

Why do we observe strong recapitulation in certain industries? Traded industries do not necessarily rely on local production and consumption because they sell products in external markets. As a result, their population dependency is weakened, and so are their recapitulation scores. We observe strong recapitulation in industries whose output is consumed locally, such as education, retail trade, construction, and utilities, and weak recapitulation in tradeable industries, such as agriculture, mining, administrative services, and finance. Therefore, our analysis provides a reliable reference for economic growth based on an industry’s tradeability. This finding also explains why studies on industrial clusters and the product space do not focus on the population dependency (34, 42).

Urban recapitulation to innovative economies

In general, how strongly do individual cities adhere to urban recapitulation? We investigate this through the aggregated employment by industry in a group of cities of similar size. A group recapitulation score averaged over all industries measures how strongly the economies of a city group recapitulate a universal pathway. We first divide cities into 20 equal-sized groups (i.e., 17 or 18 cities in each group) according to the rank of population size and calculate the ratio of industry $i$’s employment change to population change averaged for cities in group $g$. We use $\hat{\beta}_{gi}$ to denote a group scaled growth coefficient by industry. Similar to the recapitulation score in Eq. 3, the score comparing $\hat{\beta}_{gi}$ and $\beta_i$ measures the degree of recapitulation in each industry; thus, the score averaged over all industries measures the overall recapitulation of city group $g$. We use $S_g$ to denote a city group’s recapitulation score (see Materials and Methods for mathematical details). As before, $S_g = 1$ when industries of city group $g$ perfectly recapitulate the trends predicted by cross-sectional urban scaling, and $S_g$ becomes zero when population change does not affect their employment changes.

High recapitulation scores for city groups validate urban recapitulation in U.S. cities. Figure 3A demonstrates that strong recapitulation occurs over all city groups as $\bar{S}_g \approx 0.6$. Most city groups exhibit a high strength of recapitulation, except for a few groups of very small cities. The consistency with the average of industry recapitulation score $\bar{S}_i \approx 0.7$ shows that urban recapitulation is robust for industries and cities. Therefore, as recapitulation is observed in each industry, entire urban economies recapitulate the universal pathway described by scaling relations.

Fig. 3 Urban recapitulation is common across city sizes and leads small cities to follow the economic path of larger cities.

(A) The recapitulation score for each city group using industries with strong scaling relationships. Cities are binned into 20 equal-sized groups according to population size. A longitudinal scaling effect explains about 60% of employment growth in most cities except for a few very small cities. The dashed line denotes the average recapitulation score over all groups. (B) The lead-follow matrix demonstrates increases (red) or decreases (blue) in industrial similarity between cities ranked and grouped by size (i.e., group 1 denotes the largest cities and group 20 denotes the smallest cities). Each cell represents the similarity change over 10 years of an observed city group (y axis) with respect to a reference group (x axis). The positive upper triangle means that smaller cities in the future become more similar to larger cities at the present.

Last, we demonstrate the structural transition of urban economies through the longitudinal change of economic profiles. We first obtain a group economic profile (i.e., RCA values) averaged over cities divided into 20 groups according to their populations. Then, the longitudinal change of correlation of group economic profiles captures the change of industrial similarity between two city groups. For example, if the smallest cities exhibited increasing correlation in the period between 1998 and 2008 with the largest cities fixed in 1998, then the industries of the smallest cities became more similar to the past industries of the largest cities within 10 years. We measure these pairwise similarity changes between city groups and represent them as a lead-follow matrix (see Materials and Methods for the mathematical details).

The lead-follow matrix provides evidence of the universal changes led by large cities and followed by small cities (see Fig. 3B), as would be expected with urban recapitulation. The positive similarity change in the upper triangle shows that small cities become more similar to the past of the larger cities by 2.95% over a 10-year period. On the other hand, the negative similarity change in the lower triangle shows that large cities become increasingly dissimilar to the smaller cities by 4.0% over 10 years. This observation illustrates an overall pattern that large cities lead the economic developmental pathway and small cities follow it. Thus, both small cities and large cities evolve into innovative economies according to the overall increase of populations in the 16-year time span. From the evidence of the group recapitulation score and the lead-follow matrix, we conclude that cities recapitulate a universal pathway paved by scaling relations. Along with the structural difference by population size, an urban economy is expected to become more innovative, creative, and economically desirable as it grows along the universal pathway.


Urban recapitulation helps to explain the economic change of individual cities following a universal pathway governed by scaling relations. Analytical and empirical analyses of the scaling properties lead to a systematic division of small city economies from innovative large city economies at a critical population of 1.2 million. Our recapitulation framework reveals that longitudinal population changes scale employment changes in most industries with a high accuracy (about 70%). As a result, urban economies as a whole evolve into innovative economies through a universal developmental pathway led by large cities and followed by small cities. Overall, urban scaling underpins the structure and the pathway of urban economies, and the population size helps to determine the state of a city.

Our framework is inspired by the recapitulation theory in biology (33), much as urban scaling theory (14) is motivated by allometric scaling in biology (43). However, there is an important conceptual difference between them. In biological recapitulation, similar species recapitulate a shared evolutionary pathway from their ancestors in phylogeny as they grow in their life cycles. On the other hand, cities rarely die or give birth. Therefore, it is difficult to define urban phylogeny, not to mention ancestors, and thus we use the term “recapitulation” in a loose, rather colloquial sense: each city seems to recapitulate the evolutionary footprints of the largest city, as prescribed by the scaling relations.

A limitation of our study is the coverage of our dataset, which includes only a 16-year time window and economies in the United States. Fortunately, this time window includes major economic changes such as the growth of the internet and the digital economy in the 2000s and the Great Recession in 2008 and the recovery (see the Supplementary Materials for the trend of employment). Therefore, the observed urban recapitulation is not constrained to a stationary economic condition. Further studies in regions other than the United States are needed to test whether the universal pathway holds.

Our findings on recapitulation and the industrial transition offer some policy insights on urban economies. First, high-skilled workers can play an important role in maintaining urban growth. Although our finding does not suggest any causal relation, it highlights a transition to innovative industries with superlinearly increasing skilled workers, coinciding with the conventional wisdom on urban development by educated workers (1–3, 11). Second, growth of urban employment can be dominated by the industries whose products or services are locally consumed or sold in other cities or countries. As the different levels of recapitulation across industries indicate, governmental support policy on local businesses may not be effective if their products are easily replaceable by the products from other regions. Third, rapidly increasing high-tech industries could intensify the economic polarization between large cities and small cities. Most innovative industries, e.g., management, professional services, finance, and information, are not very likely to recapitulate as their products or services are not localized. This trend, with the higher risk of automation in small cities (28) and migration of skilled workers to large cities (31), demands labor policies to prepare for the vulnerable employment in small cities.



This study uses employment data by North American Industry Classification System (NAICS) industry code and 350 U.S. metropolitan statistical areas, referred to as “city” throughout the manuscript, according to the U.S. Bureau of Labor Statistics. The sizes of industry set are 19, 86, 289, 642, and 978 according to the depth of classification denoted by 2 to 6 digits. We use 2-digit classification for analyses as default. Data include annual employment measurements for the years 1998 through 2013.

Economic profile and inter-industry relatedness

We characterize industry $i$ as an array of RCA values of cities as $I_i = (\ldots, \log(\mathrm{rca}_{ci} + 1), \log(\mathrm{rca}_{c'i} + 1), \ldots)$. Similarly, a city is characterized as a profile of industries according to $I_c = (\ldots, \log(\mathrm{rca}_{ci} + 1), \log(\mathrm{rca}_{ci'} + 1), \ldots)$. The inter-industry relatedness (34, 44–46) is defined by the Pearson correlation of two economic profiles $I_i$ and $I_j$ as

$$\psi(i, j) = \rho(I_i, I_j) \tag{4}$$

where ρ is the Pearson correlation function. It describes how likely two different industries are to exist in the same cities. We measure the relatedness for each year in our dataset and average them in time. Figure 1A shows links that satisfy $\psi(i, j) > 0.15$. The pairwise correlation of cities can also be measured from the Pearson correlation $\rho(I_c, I_{c'})$.
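The relatedness in Eq. 4 can be sketched as the Pearson correlation between two columns of a log-transformed RCA matrix. The helper names and the toy RCA values below are hypothetical. The base of the logarithm is not stated in the text; natural logarithms are assumed here, and the choice does not change a Pearson correlation because it only rescales the profiles.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    syy = sum((b - y_mean) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def industry_profile(rca, i):
    """Profile of industry i: log(rca + 1) across all cities."""
    return [math.log(row[i] + 1) for row in rca]


def relatedness(rca, i, j):
    """Pearson correlation of two industry profiles (Eq. 4)."""
    return pearson(industry_profile(rca, i), industry_profile(rca, j))


# Hypothetical RCA values: rows are cities, columns are industries.
# Industries 0 and 1 tend to co-locate; industry 2 concentrates elsewhere.
rca = [
    [2.0, 1.8, 0.2],
    [1.5, 1.6, 0.5],
    [0.5, 0.4, 1.5],
    [0.2, 0.3, 2.0],
]
psi_01 = relatedness(rca, 0, 1)
psi_02 = relatedness(rca, 0, 2)
linked = psi_01 > 0.15  # threshold used for the links drawn in Fig. 1A
```

Correlating rows instead of columns of the same log-transformed matrix would give the pairwise correlation of city profiles mentioned at the end of the paragraph.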

Recapitulation score of city groups

The recapitulation score of a city group is defined by generalizing the decomposition in Eq. 2 for individual cities. We bin cities into 20 equal-sized city groups (i.e., 17 or 18 cities in each group) according to their populations. We can determine the scaled growth of each city by subtracting industry $i$’s nationwide growth $\Delta\log \hat{Y}_{io}$ estimated in Eq. 2. Then, the group scaled growth coefficient $\hat{\beta}_{gi}$ is given as the average ratio of scaled growth (i.e., $\Delta\log Y_{ci} - \Delta\log \hat{Y}_{io}$) to the population change (i.e., $\Delta\log N_c$) for the cities in city group $g$ as

$$\hat{\beta}_{gi} = \frac{\sum_{c \in g} (\Delta\log Y_{ci} - \Delta\log \hat{Y}_{io}) \cdot \Delta\log N_c}{\sum_{c \in g} (\Delta\log N_c)^2} \tag{5}$$

where $c \in g$ ranges over the set of cities in city group $g$, and the groups are determined in the order of populations (see the Supplementary Materials for the detailed derivation). A city group recapitulation score is the similarity with the static scaling exponent $\beta_i$ according to

$$S_g = 1 - \frac{1}{|I|} \sum_{i \in I} \frac{|\hat{\beta}_{gi} - \beta_i|}{\beta_i} \tag{6}$$

where $i \in I$ ranges over the set of all industries.
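Equations 5 and 6 can be sketched as follows. The sketch assumes Eq. 5 is a least-squares slope through the origin of detrended employment change against population change, and that Eq. 6 averages the relative absolute deviations over industries. The function names and all numbers are hypothetical.

```python
def group_scaled_growth(dlog_y, dlog_n, national_trend):
    """Slope through the origin of detrended employment change vs. population change (Eq. 5)."""
    numerator = sum((y - national_trend) * n for y, n in zip(dlog_y, dlog_n))
    denominator = sum(n * n for n in dlog_n)
    return numerator / denominator


def group_recapitulation(beta_hat_by_industry, beta_by_industry):
    """One minus the mean relative absolute deviation over industries (Eq. 6)."""
    deviations = [
        abs(beta_hat - beta) / beta
        for beta_hat, beta in zip(beta_hat_by_industry, beta_by_industry)
    ]
    return 1.0 - sum(deviations) / len(deviations)


# Hypothetical city group with three cities and one industry that follows
# Eq. 2 exactly, with exponent 0.9 and nationwide trend 0.05.
dlog_n = [0.1, 0.2, 0.3]
dlog_y = [0.05 + 0.9 * n for n in dlog_n]
beta_hat_g = group_scaled_growth(dlog_y, dlog_n, national_trend=0.05)

# Hypothetical group with two industries: one recapitulates perfectly,
# the other grows at half the rate its cross-sectional exponent predicts.
s_g = group_recapitulation([0.9, 0.6], [0.9, 1.2])
```

Fitting the slope through the origin, rather than with an intercept, reflects that the nationwide trend has already been subtracted from each city's employment change.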

Lead-follow matrix

A lead-follow matrix measures the temporal change of industrial similarity between cities to identify the direction of evolution. As in the “Recapitulation score of city groups” section, we bin cities into 20 equal-sized city groups according to their populations and, for each city group, we represent the economic profile $I_g(t)$ of city group $g$ at year $t$ as the average RCA values of the cities in the group according to

$$I_g(t) = \left( \ldots, \langle \log(\mathrm{rca}_{ci}(t) + 1) \rangle_{c \in g}, \ldots \right)_{i \in I} \tag{7}$$

where $I$ denotes the set of 2-digit NAICS industry codes. Then, the Pearson correlation $\phi_{gg'}(t, \tau)$ of $I_g(t)$ and $I_{g'}(t + \tau)$ shows how the economic profiles of city groups $g$ and $g'$ relate over different time periods $t$ and $t + \tau$. A lead-follow score $LF_{gg'}$ aggregates these industrial changes over each starting year in our dataset, according to

$$LF_{gg'} = T \cdot \frac{\sum_{\tau=1}^{T} \tau \cdot \langle \phi_{gg'}(t, \tau) - \phi_{gg'}(t, 0) \rangle_t}{\sum_{\tau=1}^{T} \tau^2} \tag{8}$$

where $T$ is the length of the aggregation period in years, which we set to 10 years by default. This score denotes the average similarity change between reference group $g$ and observed group $g'$, and a lead-follow matrix for the scores of all pairs summarizes the overall trend (see the Supplementary Materials for other time periods).
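The lead-follow score in Eq. 8 can be read as the slope, fitted through the origin, of the similarity change against the time lag, multiplied by the aggregation period. The sketch below follows that reading. It assumes the correlations have already been computed and stored as `phi[t][tau]` for each starting year and each lag from 0 to the horizon; the function name, this data layout, and the synthetic values are hypothetical.

```python
def lead_follow_score(phi, horizon):
    """Similarity change over `horizon` years for one (reference, observed) group pair.

    phi[t][tau] is the correlation between the reference group's profile at
    starting year t and the observed group's profile at year t + tau.
    """
    numerator = 0.0
    for tau in range(1, horizon + 1):
        # Average change in similarity at lag tau over all starting years.
        mean_change = sum(row[tau] - row[0] for row in phi) / len(phi)
        numerator += tau * mean_change
    denominator = sum(tau ** 2 for tau in range(1, horizon + 1))
    return horizon * numerator / denominator


# Synthetic correlations: similarity rises by 0.003 per year of lag
# for each of six starting years.
horizon = 10
phi = [[0.5 + 0.003 * tau for tau in range(horizon + 1)] for _ in range(6)]
score = lead_follow_score(phi, horizon)
```

For this synthetic input the score equals the per-year rise multiplied by the 10-year horizon; a positive value means the observed group grows more similar to the reference group's earlier profile.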


Supplementary material for this article is available at

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We thank R. Hausmann, F. Neffke, L. M. A. Bettencourt, and C. Hidalgo for useful comments. Funding: This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2018S1A3A2075175). Author contributions: W.-S.J. and H.Y. designed the research; I.H. and H.Y. performed the research; I.H., M.R.F., and H.Y. analyzed data; and I.H., M.R.F., I.R., W.-S.J., and H.Y. wrote the paper. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested.