Resurrecting the ancient glow of the fireflies

Science Advances  02 Dec 2020:
Vol. 6, no. 49, eabc5705
DOI: 10.1126/sciadv.abc5705


The color of firefly bioluminescence is determined by the structure of luciferase. Firefly luciferase genes have been isolated from more than 30 species, producing light ranging in color from green to orange-yellow. Here, we reconstructed seven ancestral firefly luciferase genes, characterized the enzymatic properties of the recombinant proteins, and determined the crystal structures of the gene product from ancestral Lampyridae. Results showed that the synthetic luciferase for the last common firefly ancestor emitted green light owing to a spatial constraint on the luciferin molecule in the enzyme, while fatty acyl-CoA synthetic activity, an original function of firefly luciferase, was diminished in exchange. All known firefly species are bioluminescent as larvae, with a common ancestor arising approximately 100 million years ago. Combined, our findings suggest that, within the mid-Cretaceous forest, the common ancestor of fireflies evolved green light luciferase via a trade-off with the original function, and that this light likely served as an aposematic warning display against nocturnal predation.


Over the centuries, the bioluminescence of fireflies has attracted much attention as a charming seasonal sight, particularly in Asia, and, recently, as a useful diagnostic tool in the biomedical sciences (1). It is proposed that firefly bioluminescence originated as an aposematic warning display toward predators and later acquired a role in sexual communication for many firefly species (2).

Fireflies belong to the beetle family Lampyridae, which is composed of 10 subfamilies containing around 2200 recognized species across the world (3). Luminescent beetles are additionally found in the families Phengodidae, Rhagophthalmidae, and Elateridae. These four families are members of the superfamily Elateroidea, and it has been considered that Phengodidae and Rhagophthalmidae are sister groups of the Lampyridae and share a common origin of bioluminescence; conversely, bioluminescence in Elateridae evolved independently of the Lampyridae-Phengodidae-Rhagophthalmidae lineage (4).

The molecular system of bioluminescence is shared among these four families; the chemical structure of the luminescent substrate, d-luciferin, is considered identical for all luminescent beetles, and the luminescence reaction is catalyzed by homologous luciferases (>48% amino acid identity) in the presence of O2, adenosine 5′-triphosphate (ATP), and Mg2+ (Fig. 1A) (4). Despite the commonality in enzymatic reaction and components, luminescence color can vary widely between species. The European glowworm Lampyris noctiluca, for example, emits green light, the North American Big Dipper firefly Photinus pyralis yellow-green light, and the Japanese lesser firefly Luciola parvula orange-yellow light. The differences in luminescence color are considered to be the consequence of evolutionary strategies for warning predators and attracting mating partners more effectively (2, 5).

Fig. 1 Firefly bioluminescence and color evolution.

(A) Coleopteran bioluminescence reaction. (B) Molecular phylogeny of luciferases and related enzymes. The leaf nodes are labeled with species name, protein name, and GenBank accession number. Branches are labeled with bootstrap probability (1000 reconstructions). The resurrected ancestral nodes are shown as a square. The leaf nodes are indicated with in vitro luminescent colors (green, yellow-green, yellow, orange, or red) judged by the luminescence maximum values in references (table S3). The resurrected ancestral nodes are indicated with in vitro luminescence colors judged by the combination of luminescence maximum (Table 1) and perceived coloration. Luciferases without spectral data and fatty acyl-CoA synthetase (nonluciferase) are denoted with white and gray circles, respectively. The 18S rRNA–based species trees (fig. S6) are superimposed onto the subtrees of Luc1 type (red lines) and Luc2 type (blue lines) to confirm consistency between the gene/protein phylogeny and the species phylogeny. (C) Photographs of the luminescence of seven ancestral luciferases in 96-well plate. The camera exposure time for each well is uneven to avoid the changes in coloration by overexposure. Luminescence of AncElat was not photographed even by longer exposure. Luminescence of AncCanth was categorized as yellow on the basis of λmax value but appears in orange on the photograph probably because of its broad spectrum.

Fossil records of fireflies are limited, but a single adult male firefly exhibiting an obvious photophore structure on the abdominal segment was recovered from Burmese amber dating back to 100 million years (Ma) ago (6). This specimen strongly suggests that the ancestral firefly of the Cretaceous period was already bioluminescent and used light emission for some purpose. However, the color of ancestral bioluminescence is not evident from fossil records, making it difficult to predict the original function of the firefly light.

The colors of light in fireflies are regulated by the luciferase structure (7) and, probably, not by either the luciferin molecule or the effects of color filters (like the blue transmission filter of the hatchetfish photophore) (8, 9) and not by fluorescent substances (like Aequorea green fluorescent protein in jellyfish). For this reason, in vivo firefly luminescence colors match principally those of the in vitro luminescent reaction of the luciferase with d-luciferin (9–11). Thus, the evolutionary history of firefly bioluminescence is traceable by using the sequence data of extant luciferases.

To date, two types of firefly luciferase genes, Luc1 type and Luc2 type, have been isolated from more than 30 species in various subfamilies. Most of these are Luc1-type genes, but Luc2-type genes have recently also been isolated from several species of two major subfamilies, Lampyrinae and Luciolinae. The gene duplication occurred at the basal position of the lineage in Lampyridae, suggesting that probably all fireflies have both Luc1 and Luc2 (4, 10, 11). Gene expression profiles have suggested their subfunctionalization: extant Luc1-type luciferase is responsible for the green-yellow luminescence of the lantern in larvae, pupae, and adults, while extant Luc2-type luciferase is responsible for the dim green glow of oviposited eggs, the pupal body, and the ovaries (9, 11). On the basis of these premises, we recreated putative ancestral firefly luciferases by predicting their amino acid sequences with the maximum likelihood method of ancestral state reconstruction (12) and experimentally characterized their enzymatic properties, including the luminescence colors.


Ancestral sequence reconstruction of firefly luciferases

To reconstruct the ancestral states, the molecular phylogenetic tree was constructed for a total of 53 amino acid sequences of beetle luciferases and close homologs (black line cladogram in Fig. 1B). The tree topology was mostly consistent with that of the species tree (red and blue line cladograms in Fig. 1B) based on the total evidence approach (2, 3). To trace back the evolution of firefly luciferase properties, we targeted seven key ancestral genes encoding the ancestral luciferases of Elateroidea (AncElat); the Lampyridae-Phengodidae-Rhagophthalmidae lineage (AncCanth); Lampyridae (AncLamp); Luc1-type luciferase of Lampyridae (AncLuc1); Luc2-type luciferase of Lampyridae (AncLuc2); Luc1-type luciferase in Lampyrinae, a major subfamily of Lampyridae (AncLampn1); and Luc1-type luciferase of Luciolinae, another major Lampyridae subfamily (AncLucin1) (Fig. 1B). Monophyly of all ancestral nodes is supported with >99% bootstrap values, except for AncElat (62%) and AncCanth (41%).

The ancestral luciferase sequences were inferred on the basis of the molecular phylogeny of the extant amino acid sequences (fig. S1). The average posterior probabilities of the amino acid residues in the seven reconstructed ancestral luciferases were sufficiently high, ranging from 0.984 to 0.915 (Table 1 and fig. S2). Although 2 to 21 sites were estimated rather uncertainly, most of these sites were located distant from the active sites (7) of luciferase, with a few exceptions (fig. S3 and table S1).
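As a rough illustration of how such summary statistics can be derived, the following minimal sketch takes per-site posterior distributions, picks the most probable residue at each site, averages the posterior probabilities of the chosen residues, and lists sites that fall below a cutoff. The toy posteriors, the function name `summarize_reconstruction`, and the 0.8 ambiguity cutoff are all assumptions made for illustration; they are not data or thresholds taken from this study.

```python
from statistics import mean

# Hypothetical per-site posteriors for a toy 4-residue ancestor.
# These numbers are invented for illustration only.
toy_posteriors = [
    {"M": 0.99, "L": 0.01},
    {"E": 0.97, "D": 0.03},
    {"I": 0.55, "V": 0.45},
    {"K": 1.00},
]


def summarize_reconstruction(site_posteriors, ambiguity_cutoff=0.8):
    """Return (most probable sequence, mean posterior, ambiguous site indices).

    ambiguity_cutoff is an assumed threshold, not one stated in the text.
    """
    residues = []
    probs = []
    for posterior in site_posteriors:
        # Pick the residue with the highest posterior probability at this site.
        residue, prob = max(posterior.items(), key=lambda item: item[1])
        residues.append(residue)
        probs.append(prob)
    # Sites whose best residue is still poorly supported (0-based indices).
    ambiguous = [i for i, p in enumerate(probs) if p < ambiguity_cutoff]
    return "".join(residues), mean(probs), ambiguous


sequence, average_pp, ambiguous_sites = summarize_reconstruction(toy_posteriors)
print(sequence, round(average_pp, 4), ambiguous_sites)
```

In this toy case, the third site is flagged because neither Ile nor Val is strongly supported, which is the kind of site whose position relative to the active site would then need checking.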

Table 1 Reconstructed ancestral luciferases.

Luminescence properties of ancestral firefly luciferases

The seven codon-optimized ancestral gene sequences were synthesized and cloned into an expression vector. The recombinant proteins expressed in Escherichia coli were purified using a cobalt chelating column (fig. S4). The luminescence activities of the purified recombinant proteins were measured in the presence of d-luciferin, ATP, and Mg2+ and showed intensities similar to those of the extant luciferases (Luciola cruciata LcLuc1, LcLuc2, and P. pyralis PpyLuc1), except for AncCanth and AncElat, for which the luminescence activities were lower and at trace level, respectively (Fig. 2A).

Fig. 2 Biochemical properties of ancestral luciferases.

(A) Integration of luminescence intensities from 2 to 32 s. The values are shown by relative light unit per nanogram of protein (RLU/ng). White and black bars represent the mean with standard deviation values at 50 and 200 μM of d-luciferin, respectively (n = 3). (B) Luminescent spectra of seven recombinant ancestral luciferases and three extant firefly luciferases. (C) ACS activities of seven ancestral luciferases and three extant firefly luciferases to lauric acid are shown. Bars represent SE of the means (n = 3 to 6).

The color of light produced by fireflies can be approximated from the λmax value of the spectrum, although it cannot be strictly defined because of the broad spectral bandwidth. In this study, we provisionally describe the color of the light produced by the luciferin-luciferase reaction as “green” for λmax 520 to 549 nm, “yellow-green” for λmax 550 to 559 nm, “yellow” for λmax 560 to 584 nm, and “orange” for λmax 585 to 619 nm, which are approximately aligned with our visual perception.
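The provisional color bins above can be written as a small lookup. This is a minimal sketch: the function name `classify_color` is hypothetical, and treating the bins as half-open intervals (so that a non-integer λmax such as 549.5 nm still gets a label) is an assumption, since the text defines the bins only at whole-nanometer boundaries.

```python
# Provisional color bins from the text, as (lower bound, upper bound, label).
# Bounds are half-open [low, high) in nm; for whole-number wavelengths this
# is equivalent to 520-549, 550-559, 560-584, and 585-619 nm.
COLOR_BINS = [
    (520, 550, "green"),
    (550, 560, "yellow-green"),
    (560, 585, "yellow"),
    (585, 620, "orange"),
]


def classify_color(lambda_max_nm):
    """Return the provisional color label for a lambda-max value, or None."""
    for low, high, label in COLOR_BINS:
        if low <= lambda_max_nm < high:
            return label
    # Outside the range covered by the provisional scheme.
    return None


for lambda_max in (594, 584, 553, 563):
    print(lambda_max, classify_color(lambda_max))
```

Because the spectra are broad, a label derived from λmax alone can differ from the perceived color, as noted for AncCanth in the legend of Fig. 1C.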

Measurement of the luminescence spectra of the seven ancestral luciferases showed that AncElat and AncCanth emitted in the orange-red region (λmax 594 and 584 nm, respectively); AncLuc1, AncLuc2, and AncLamp emitted in the green region (λmax ≤550 nm), the most hypsochromically shifted (that is, with the peak shifted to the shortest wavelengths) among the seven ancestors; and AncLampn1 and AncLucin1 emitted yellow-green and yellow light (λmax 553 and 563 nm), respectively (Table 1 and Figs. 1C and 2B).

Acyl-CoA synthetic activities of ancestral firefly luciferases

Firefly luciferases are bifunctional enzymes; in addition to luciferase (luminescence) activity, they have a “promiscuous” acyl-CoA synthetic (ACS) activity to various fatty acids (Fig. 1A) (13–15). Consequently, it has been hypothesized that the firefly luciferases originated from a fatty acyl-CoA synthetase (15). We examined ACS activities of the seven ancestral luciferases using lauric acid as a substrate.

Results showed that the ACS activity of AncElat was about an order of magnitude higher than that of the other ancestral and extant luciferases, with a specific activity of 1126.1 ± 9.6 nmol/min per milligram of protein (Fig. 2C). The activities of the other ancestral luciferases were rather low, ranging from 140.4 ± 3.2 (AncCanth) to 15.1 ± 2.7 (AncLamp), and these values were comparable with those of the extant luciferases, which ranged from 145.0 ± 28.0 (LcLuc1) to 9.7 ± 1.6 (LcLuc2).
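To make the size of this difference concrete, the short sketch below computes the ratio of the AncElat activity to each of the other mean values quoted in this paragraph. Only the means reported here are used; the dictionary layout and the function name `fold_over` are illustrative choices, and the reported errors are ignored.

```python
# Mean ACS specific activities toward lauric acid quoted in the text
# (nmol/min per milligram of protein); error terms are omitted.
acs_activity = {
    "AncElat": 1126.1,
    "AncCanth": 140.4,
    "AncLamp": 15.1,
    "LcLuc1": 145.0,
    "LcLuc2": 9.7,
}


def fold_over(reference, activities):
    """Return the ratio of the reference activity to every other entry."""
    ref_value = activities[reference]
    return {
        name: ref_value / value
        for name, value in activities.items()
        if name != reference
    }


for name, ratio in fold_over("AncElat", acs_activity).items():
    print(f"AncElat / {name}: {ratio:.1f}-fold")
```

The ratios span roughly 8-fold (against AncCanth and LcLuc1) to more than 100-fold (against LcLuc2), so "an order of magnitude" describes the lower end of the range.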

Crystal structure of ancestral firefly luciferases

The ancestral firefly luciferase AncLamp was cocrystallized with the reaction-intermediate analog 5′-O-[N-(dehydroluciferyl)-sulfamoyl]adenosine (DLSA), and the crystal structure [Protein Data Bank (PDB) ID 6K4C] was determined to a 2.1-Å resolution (Fig. 3A and table S2). The overall structure of AncLamp was well conserved with LcLuc1 (PDB ID 2D1S), which shows 79% amino acid sequence identity and 1.64-Å root mean square deviation for 535 Cα atoms with AncLamp (7).

Fig. 3 Structure of AncLamp.

(A) The structure of AncLamp-DLSA complex (green) is superposed on that of Luciola cruciata LcLuc1-DLSA complex (white). DLSA is represented as stick model. The all ligand omitting Fo-Fc map (contoured at 6 σ) is shown in blue. The positions of N and C termini are indicated. (B) Detail of the substrate-binding site of superposed AncLamp-DLSA (green), LcLuc1-DLSA (white), and AncLamp-substrates (d-luciferin/ATP) (light green) complex structures. The residues differing between AncLamp and LcLuc1, Ile236 (Val239 in LcLuc1), Ile240 (Val243), and Leu285 (Ile288), and those in close contact with these residues, Phe246 (Phe249) and Thr238 (Thr241), are shown in stick models. The molecules of the DLSA and probable intermediate, luciferyl-AMP (AMP-Luc*), are shown as stick models colored by element (carbon atoms are colored in white, gray, and light gray for AncLamp-DLSA, LcLuc1-DLSA, and AncLamp-substrates complex structures, respectively). (C) Amino acids of the ancestral luciferases and LcLuc1 at the sites discussed in the text. The residue numbers of AncLamp are indicated above the alignment. See fig. S1 for original alignment. (D) Close-up view of the substrate-binding site of AncLamp (left) and LcLuc1 (right) in van der Waals model. The different residues between two proteins, i.e., Ile236 (Val239 in LcLuc1), Ile240 (Val243), and Leu285 (Ile288), and those in close contact, Phe246 (Phe249) and Thr238 (Thr241), are shown. The distances from Cε1 atoms of Phe246 (Phe249) to DLSA are indicated in red.

The DLSA molecule was in close contact (at least one atom of the amino acid residue lay less than 3.5 Å from DLSA atoms) with His244, Phe246, Thr250, Ser313, Pro317, Gly338, Tyr339, Gly340, Leu341, Thr342, Thr361, Asp421, Ile433, Arg436, and Lys528 in the AncLamp-DLSA complex structure. These substrate-binding residues are generally conserved with LcLuc1. The conformations of the bound DLSA molecule were nearly identical between AncLamp and LcLuc1, with a 0.38-Å root mean square deviation for 40 DLSA atoms in the superposed structures (Fig. 3B).
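For readers unfamiliar with the metric, the root mean square deviation over N matched atoms is the square root of the mean squared distance between corresponding atom positions. The sketch below computes it for coordinates that are assumed to be already superposed and listed in the same atom order; it does not perform the superposition itself. The function name `rmsd` and the two-atom toy coordinates are hypothetical and unrelated to the deposited structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input) between two
    equally long lists of (x, y, z) tuples in matching atom order.

    Assumes the two structures have already been superposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: two atoms displaced by 0.3 and 0.4 Å, respectively.
ligand_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ligand_b = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
print(round(rmsd(ligand_a, ligand_b), 3))
```

Applied to the 40 matched DLSA atoms of two superposed complexes, the same calculation would yield a value of the kind reported above.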

A noticeable difference in DLSA interaction between AncLamp and LcLuc1 was observed in proximity of the residues, namely, Ile236 (Val239 in LcLuc1), Ile240 (Val243), and Leu285 (Ile288) (Fig. 3C). Mainly because of the packing of Cδ1 atom of Ile240 and Cδ2 atom of Leu285, the phenyl ring of Phe246 was slightly extruded toward the substrate-binding cavity. The distance from the Cε1 atom of Phe246 (Phe249 of LcLuc1) residue to the thiazole plane of DLSA (defined by atoms C8-S9-C4) was 3.18 and 3.43 Å in the AncLamp and LcLuc1 structures, respectively (Fig. 3D). Consequently, the space for luciferyl moiety appeared to be narrower in the AncLamp structure. The cavity volume for luciferyl moiety was 599 Å3 in AncLamp, which was notably smaller than 719 Å3 in LcLuc1.

The ancestral firefly luciferase AncLamp was also cocrystallized with genuine firefly luminescence substrates, namely, ATP and d-luciferin, and the crystal structure was determined to a 1.7-Å resolution (PDB ID 6K4D). Although the protein structure was almost identical to that of the AncLamp-DLSA complex, the electron density map at the substrate-binding site demonstrated an intriguing feature, suggesting the presence of the reaction intermediate, most probably a luciferyl–adenosine monophosphate (luciferyl-AMP). The conformations of the substrate-interacting residues were similar between the DLSA and luciferyl-AMP complex structures. The conformation of luciferyl-AMP was also very close to that of DLSA, with a 0.42-Å root mean square deviation for 40 atoms (Fig. 3B). The cavity volume for the luciferyl moiety was 558 Å3. Thus, the reduction in space for the luciferyl moiety, which was observed for the intermediate analog (DLSA) complex, appeared to hold for a genuine intermediate molecule as well.
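The cavity volumes reported for the three complexes can be compared as relative reductions with respect to LcLuc1. The sketch below uses only the volumes quoted in the text (719 Å3 for LcLuc1-DLSA, 599 Å3 for AncLamp-DLSA, and 558 Å3 for AncLamp with luciferyl-AMP); the function name `percent_reduction` and the dictionary keys are illustrative.

```python
# Cavity volumes for the luciferyl moiety quoted in the text (cubic angstroms).
cavity_volume = {
    "LcLuc1-DLSA": 719.0,
    "AncLamp-DLSA": 599.0,
    "AncLamp-luciferyl-AMP": 558.0,
}


def percent_reduction(reference_volume, volume):
    """Percentage by which volume is smaller than reference_volume."""
    return 100.0 * (reference_volume - volume) / reference_volume


reference = cavity_volume["LcLuc1-DLSA"]
for name in ("AncLamp-DLSA", "AncLamp-luciferyl-AMP"):
    reduction = percent_reduction(reference, cavity_volume[name])
    print(f"{name}: {reduction:.1f}% smaller than LcLuc1-DLSA")
```

On these figures the AncLamp cavity is roughly one-sixth to one-fifth smaller than that of LcLuc1, depending on which ligand is bound.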

In the crystal structure of AncLamp with ATP and d-luciferin, substantial residual density was observed after building the luciferyl-AMP model, and it was interpreted to be unreacted d-luciferin. These residual densities fit well to sulfur or oxygen atoms when unreacted d-luciferin was built into the substrate-binding cavity (fig. S5B). Unexpectedly, the unreacted d-luciferin was trapped in the substrate-binding cavity upside-down for reaction (the reactive carboxylate moiety was pointing to the bottom of cavity), apparently showing unproductive configuration and suggesting an immature specificity for d-luciferin. No structure of luciferase complexed with d-luciferin or with d-luciferin and ATP has been reported so far, and this was the first observation of the substrate configuration for luciferase. Therefore, significance of the observed d-luciferin configuration for the evolution of acyl-CoA synthetase/luciferase activity was inconclusive at this point of time.

Molecular clock analyses

For dating the ancestral luciferases, a phylogenetic tree of Elateriformia species was constructed on the basis of the 18S ribosomal RNA (rRNA) sequences. The geological ages of the ancestral luciferases (corresponding to the speciation nodes), except for AncLamp (common ancestor of duplicated genes of Luc1 and Luc2-type luciferases), were estimated with the Bayesian Markov chain Monte Carlo (MCMC) method by referring to the fossil date 152 ± 10 Ma ago of the most recent common ancestor of Elateroidea as the prior probability distribution (Table 1 and Fig. 4).

Fig. 4 Dating of ancestral species.

Geological dating of the ancestral species (concestors) (50) bearing the ancestral luciferases based on the molecular clock analysis of 18S rRNA genes of Elateriformia species. The nodes corresponding to the hypothesized concestors, whose genomes encoded AncElat, AncCanth, AncLampn1, and AncLucin1, are indicated as ConElat, ConCanth, ConLampn1, and ConLucin1, respectively. ConLuc1/2 is the concestor of Lampyridae, which had the duplicated genes of AncLuc1 and AncLuc2 as the first time and AncLamp as the last time (Fig. 1B). The median of estimated geological age and 95% HPD (light blue bar) from the MCMC estimations (fig. S6) are indicated on each node. The geohistorical positions of the oldest firefly fossil with abdominal photophore structure (6) and gene duplication between Luc1-type and Luc2-type genes are indicated on the phylogeny or geological time scale.

The oldest ancestral luciferase homolog of the present study, AncElat, was dated back to 115.85 Ma ago with 95% highest posterior density (HPD) of 84.82 to 145.26 Ma ago. As demonstrated in the luminescence activity assay, AncElat was nonluminescent. It suggested that the common ancestor of Elateroidea (indicated as ConElat, the common ancestor bearing AncElat gene, in Fig. 4) did not use bioluminescence for any purpose until the emergence of the Lampyridae-Phengodidae-Rhagophthalmidae lineage (ConCanth).

AncCanth, which demonstrated lower luminescence activity than the extant luciferases, belonged to the last common ancestor of Lampyridae, Phengodidae, and Rhagophthalmidae. Thus, it might define the oldest date of bioluminescence activities in the Lampyridae-Phengodidae-Rhagophthalmidae lineage. Molecular clock analyses suggested that the Lampyridae originated in the mid-Cretaceous (16). This is consistent with the geological age of ConCanth estimated in the present study, 102.55 (95% HPD of 71.06 to 132.83) Ma ago (Fig. 4). This dating was also approximately consistent with the oldest fossil record of an ancient firefly having a photophore structure on the abdominal segment, which dates back to around 100 Ma ago (6).

AncLuc1 and AncLuc2 belonged to the last common ancestor of Lampyridae (ConLuc1/2) at 69.59 (95% HPD of 44.57 to 96.19) Ma ago. The duplication between Luc1-type and Luc2-type genes should have occurred between ConCanth and ConLuc1/2, where the last common ancestral gene encoded AncLamp. The results suggested that it took place between the late Cretaceous and early Paleogene periods (Fig. 4).


Origin of beetle bioluminescence

Substantial evidence for luminescent color evolution in fireflies is provided in this study. The luminescence activity in five more-recent ancestral luciferases (AncLamp, AncLuc1, AncLuc2, AncLampn1, and AncLucin1) was comparable to that of extant luciferases, suggesting that the bioluminescence in the common ancestor of Lampyridae had already acquired some biological function(s).

Interpretation of the substantial but less active luminescence activity in AncCanth will be controversial. The most parsimonious reconstruction of the ancestral state, based on molecular phylogenetic analyses, indicated that a common ancestor of the Lampyridae-Phengodidae-Rhagophthalmidae lineage was “bioluminescent” (4). Bioluminescent courtship is observed only in some species within the Lampyridae and Rhagophthalmidae, suggesting that the original function of luminescence in the common ancestor of the Lampyridae-Phengodidae-Rhagophthalmidae lineage was aposematic display against nocturnal predation, and that the mating function in bioluminescence evolved independently in both Lampyridae and Rhagophthalmidae (2). Thus, the substantial but less active bioluminescence of yellow-orange color in AncCanth may represent the ancestral weak luminescence. Aposematic luminescence in eggs and pupae in extant fireflies is very weak compared to the lantern luminescence in adult fireflies, which functions in mating (9, 17). On the other hand, inaccurate prediction of the ancestral sequence in AncCanth might also be the reason for its weaker luminescence activity, because of the relatively higher ambiguity of older sequence information (figs. S2 and S3). Although the main focus of this paper is the bioluminescent state of a common ancestor of Lampyridae (AncLamp), further research is necessary to understand the state of bioluminescence of the common ancestor in the Lampyridae-Phengodidae-Rhagophthalmidae lineage using the recent transcriptome data of Phengodidae (18).

A substantial lack of luminescence activity in the oldest AncElat confirms the hypothesis based on genome analysis (4), suggesting that a common ancestor of the Elateroidea was nonluminescent and that their bioluminescence abilities have evolved in parallel in the Lampyridae-Phengodidae-Rhagophthalmidae lineage and Elateridae (Fig. 1, B and C). The lack of luminescence activity in AncElat can be rationalized by the amino acid sequences. Amino acid sequence comparisons between ancestral/extant luciferases and a related fatty acyl-CoA synthetase highlighted two sites of interest, namely, Gly315 and Ser346. They were the only residues in 12 substrate-binding sites (fig. S1, outlined in blue) consistent with luminous active/inactive features, i.e., conserved within luminescent active or inactive proteins but not between them. On the basis of the crystal structure of AncLamp, Gly315 and Ser346 are responsible for the binding to AMP and luciferyl moieties, respectively. In the latter position, polar amino acids (serine or cysteine) are used in all extant luciferases and “luminescent active” ancestral luciferases, and the side chain can interact with the luciferin molecule via hydrogen bonding (fig. S5A). On the other hand, nonpolar leucine is used in luminescent inactive AncElat and AbLL, and the side chain probably cannot contribute to a polar interaction with luciferin (19). This is congruent with the finding that a single mutation of leucine at this position to serine recovered substantial luminescence activity in both AbLL and CG6178, a fatty acyl-CoA synthetase in Drosophila (20). In relation to luminescence activity and this residue, it has been reported that the gene product of a luciferase-like gene from the tenebrionid beetle Zophobas morio, Zop, showed substantial luminescence activity with d-luciferin, and that position 329 in Zop (position 346 in AncLamp) was leucine (21). Luminescence is absent in the Leu329Ser mutant of Zop (21). Zop is also an acyl-CoA synthetase, but beetle luciferases are more closely related to AbLL and CG6178 than to Zop. This implies that multiple pathways from acyl-CoA synthetase to luciferase are possible.

Trade-off evolution of acyl-CoA synthetase and luciferase

Our previous reports have suggested that firefly luciferases and their orthologs in beetles and Drosophila have ACS activity, and the most preferable substrate was lauric acid (14). On the basis of these results, we have hypothesized that beetle luciferase originated from a fatty acyl-CoA synthetase (presumably, medium-chain fatty acyl-CoA synthetase) (4, 14, 15). More recently, enzymatic promiscuity of beetle luciferase to various carboxylate substrates has been demonstrated, suggesting the evolutionary origin of beetle luciferase from an acyl-CoA synthetase with broad substrate specificity including long-chain fatty acids (15). The results of the present study showed that the oldest ancestor, AncElat, exhibited about 10 times higher ACS activity than the other ancestral luciferases and the extant luciferases. This is in notable contrast to the luminescence activity profiles and in agreement with our “ACS origin” hypothesis of beetle luciferase (Fig. 2, A and C). The presence of ACS activity in AncElat may also be rationalized by the amino acid substitution. It has been reported that fatty acids are also competitive inhibitors of firefly luciferase at higher concentrations (22), indicating that luciferin and fatty acids share the same substrate-binding site. It is expected that the nonpolar leucine at position 346 in AncElat will favorably accept the aliphatic substrate into the binding site. Consistent with this, all AbLL mutants having leucine at this position exhibited higher ACS activities than those having serine (19).

This observation also fits the evolutionary “trade-off” model of enzyme neofunctionalization, in which the original catalytic function of an enzyme is decreased during the process of acquiring a new function (23). We previously suggested that the trade-off model fits the evolution of beetle luciferase by site-directed mutagenesis studies of beetle fatty acyl-CoA synthetase (19). We constructed all combinations of the mutants for three amino acids near the active site in AbLL, an ACS of a nonluminous click beetle, and measured both luciferase and ACS activities. The results showed negative trade-offs of these activities among mutants, that is, mutants having higher ACS activity exhibited lower luciferase activity, and vice versa (19). Since the activities of both the luciferase reaction and fatty ACS reaction are the sum of complex kinetics by multiple reaction steps, further studies on biochemical kinetics for each step are necessary to fully understand the evolution of firefly luciferase from acyl-CoA synthetase.

Recent firefly genome analysis showed that firefly luciferase evolved by tandem gene duplications of ACS and subsequent acquisition of the luciferase activity in a redundant gene (4). Probably, the multiple gene duplications of ACS occurred in an ancestral nonbioluminescent beetle leading to the evolution of a new function “luminescence activity” without suffering any adaptive disadvantages in organisms.

Evolution of luminescence color

Luminescence spectra and photography of the ancestral luciferases suggested that the last common ancestor of the Lampyridae-Phengodidae-Rhagophthalmidae lineage emitted yellow (but rather orange by photograph, Fig. 1C) light (AncCanth); the last common ancestor of Lampyridae emitted green luminescence (AncLamp); and after gene duplication, bathochromic shifting (shifting the peak to longer wavelength) occurred in Luc1-type luciferase from green to yellow-green (AncLuc1 and AncLampn1), yellow (AncLucin1), and orange-yellow during the evolution of the lineages in Lampyrinae and Luciolinae, while Luc2-type luciferase conserved the ancestral green light (AncLuc2) in the extant species (Fig. 1B) (10, 11). The bathochromic luminescence of AncCanth fits the hypothesis that beetle “protoluciferases,” having a less developed substrate-binding site environment, emitted red light (24, 25). The present study provides experimental evidence that the first luciferase “prototype” emitted bathochromic yellow-orange color.

Subsequently, the luminescent color of the first luciferase has been subjected to adaptive evolution. Green luminescence in AncLamp fits the current hypotheses of firefly biochemists and ecologists, which assume the luminescence color of the common ancestor to be green (5, 26–28). Green light emission has been observed in the larval stages of the extant diurnal fireflies and also the egg and pupal stages of nocturnal fireflies. Green luminescence is actually common in a number of terrestrial luminous organisms, such as luminous mushrooms and millipedes (29, 30). As fireflies and millipedes are known to have distasteful toxins, they are considered to use luminescence as a warning signal (31, 32). The presence of nocturnal predators maximally sensitive to green light may have given rise to aposematic displays conferring a selective advantage to fireflies emitting at night (27, 28). The presented results demonstrated that the green luminescence from AncLamp, of the last common ancestor of fireflies ConLuc1/2, was already hypsochromically shifted nearly to the known limit for extant luciferases. Green luminescence may also have had an advantage for mate recognition in the early days of Lampyridae, because it can be postulated that “visual spectral sensitivity in nonbioluminescent ancestors of fireflies was conditioned by the predominance of green light reflected from foliage” by comparison of the extant firefly luminescence (5). A fossil male adult firefly in 100-Ma-old amber already had an abdominal photophore in which “the location and size of the photic organ resemble those in many modern Lampyridae” (6), implying a function in sexual communication.

To elucidate the molecular basis of hypsochromic shifted luminescence in AncLamp, the crystal structure was determined and compared with the structure of extant LcLuc1. The substrate-binding site of AncLamp is generally conserved with LcLuc1, but positions 240 and 285 have Ile and Leu substitutions, respectively (Fig. 3C). These substitutions appeared to affect the conformation of conserved Phe246 and reduce the space for accommodating the luciferyl moiety of d-luciferin from 719 Å3 in LcLuc1 to 599 Å3 (DLSA complex) and 558 Å3 (luciferyl-AMP complex) in AncLamp. The distance from the Cε1 atom of the Phe residue to the thiazole plane of DLSA is 3.43 and 3.18 Å in the LcLuc1 and AncLamp structures, respectively (Fig. 3D).

The previous structural analysis of LcLuc1 revealed that the steric constraint on the luciferyl moiety was one of the most significant factors for luminescence wavelength; the Ser286Asn mutant of this protein caused a bathochromic shift in wavelength, and the crystal structure indicated that the mutation induced a conformation change in a substrate-binding residue, Ile288, making a space for thermal relaxation of the luciferyl moiety (7). An energy loss through thermal motion of the luciferyl moiety was thought to be responsible for a low quantum yield and bathochromic shift of luminescence wavelength. Thus, the hypsochromic shifted luminescence in AncLamp can also be explained by the spatial constraint on the luciferyl moiety, although it arises from largely different atomic interactions, as observed in the structure. This suggests that the spatial constraint occurs naturally and may be a general strategy for luminescent hypsochromic shift in luciferases.

Hydrogen bonding patterns on DLSA in the crystal structures of AncLamp and LcLuc1 were also compared to assess the possible cause of the hypsochromic shifted luminescence (33). However, the luciferyl moiety of the DLSA formed only a few hydrogen bonds through O10 (hydroxyl group of benzothiazole ring), N7 (amine group of thiazole ring; fig. S5A), and O39 (amide group between luciferyl moiety and AMP moiety) atoms, and they were mostly conserved between AncLamp and LcLuc1 (table S4). The only exception was that the last one, at the periphery of the luciferyl moiety, was mediated by a water molecule in LcLuc1. This suggested that the hydrogen bonding pattern has been conserved for the core luciferyl moiety and is unlikely to be the cause of the hypsochromic shift of luminescence in this case.

The relationship between the luminescence color of firefly luciferase and its amino acid sequence has been extensively studied by site-directed and random mutagenesis, and most of these mutants exhibited bathochromic-shifted luminescence (21, 34). The reported mutations (fig. S1 and table S5) were comprehensively compared with the amino acid differences between ancestral luciferases, and it was found that the mutation studies reported by Kajiyama and Nakano (35) and Branchini et al. (34) coincided with our current results in terms of the site position, substitution pattern, and spectral shift phenotype (Fig. 3, C and D). Position 236 (in AncLamp) was Val in AncElat and Ile in all other ancestral luciferases; thus, the spectral hypsochromic shift in AncElat-AncLamp was consistent with the result of the Val239Ile mutant in Luciola cruciata (LcLuc1), exhibiting a hypsochromic shift from λmax 562 to 558 nm at pH 7.8 (35). Position 240 was Leu in AncElat, Val in AncLucin1, and Ile in all other ancestral luciferases; thus, the spectral bathochromic shift in AncLamp-AncLucin1 was consistent with the result of the Val241Ile mutant in P. pyralis (PpyLuc1), exhibiting a hypsochromic shift from λmax 557 to 555 nm at pH 7.8, from 562 to 557 nm at pH 7.0, and from 613 to 609 nm at pH 6.0 (34). Furthermore, this position in extant LcLuc1 was Val; thus, the bathochromic shift in AncLamp-LcLuc1 was also consistent. These two positions were located directly behind Phe246, which faces the luciferin moiety in the crystal structure of AncLamp (Fig. 3D), demonstrating their involvement in the evolution of luminescence color.

On the basis of the luminescence colors of AncLuc1, AncLuc2, AncLampn1, and AncLucin1, we propose an evolutionary diversification scenario for bioluminescence in Lampyridae. After a gene duplication event at about 70 Ma ago (Fig. 4), spatiotemporal subfunctionalization occurred; Luc1-type luciferase is used for the luminescence of lanterns in the larval, pupal, and adult stages, while Luc2-type luciferase is used for the glow of eggs and the pupal body (9, 11, 17). Green luminescence in AncLuc1 and AncLuc2 suggests that both Luc1 and Luc2 initially emitted in the same ancestral green spectrum, as also seen in the extant Pyrocoelia atripennis (9). In this regard, Luc2-type luciferases should be the authentic orthologs of AncLamp. After that, in both the Lampyrinae and Luciolinae lineages, some species independently began to adapt to a twilight environment for mating behavior. These crepuscular fireflies then needed to differentiate mate luminescence from the background noise of green foliage, and lantern emissions became more bathochromic shifted toward yellow (AncLampn1 and AncLucin1), as seen in extant P. pyralis and L. parvula (5). The spectral sensitivities of crepuscular fireflies match their own yellowish light spectra so that conspecific luminescence is received effectively (5, 36). In contrast to the colorful evolution of Luc1-type luciferase, extant Luc2-type luciferase conserved the original green luminescence of AncLuc2 for warning displays in the egg and pupal stages in both diurnal and nocturnal fireflies, because the gene duplication between Luc1 type and Luc2 type enabled a role differentiation between the paralogs.

In conclusion, we predict that the most ancestral fireflies emitted green light for aposematic display purposes, and some species evolved to use the light in a secondary role for complex sexual communication, shifting to a more yellowish light (Fig. 5). We believe that our results demonstrate the practical usefulness of ancestral gene reconstruction methods (12), successfully reenacting visual scenarios of the lost world.

Fig. 5 Scheme of the evolution of firefly’s bioluminescence.

Horizontal and vertical axes represent the approximate age and luminescence activity, respectively. Z axis shows the color of ancestral luciferases.


Ancestral luciferase sequence prediction

The initial set of amino acid sequences was retrieved from the GenBank, RefSeq, and UniProt databases. The sequences were aligned by using MAFFT v7.309 and manually refined with the XCED v8a.3.93 program (37). The final alignment contained a total of 52 sequences (Fig. 1B and table S3). The topology of the phylogenetic tree was inferred on the basis of the amino acid sequences of the extant proteins with the neighbor-joining method on the Jones-Taylor-Thornton (JTT) matrix (38, 39). Then, the tree topology was manually refined so that it was consistent with the currently accepted species phylogeny by referring to the literature (2, 16).

The phylogenetic tree and the alignment were supplied to PAML v4.5 to infer the ancestral sequences (40). The empirical model with the JTT matrix was used as the substitution model. No site partition was defined, and the substitution rate was uniform over the sites (41). The ancestral sequences were verified by reconstructing molecular phylogenies to determine whether the inferred sequences were connected to the corresponding nodes with zero evolutionary distance. The average posterior probabilities over the amino acid sites for each ancestor are shown in Table 1. The most probable amino acid sequences are aligned in fig. S1. The distributions of site posterior probabilities for each ancestral luciferase are shown in fig. S2. The numbers and distribution of less certainly identified residues, for which the posterior probability is less than 0.1 higher than that of the second most probable amino acid, are shown in fig. S3.
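The criterion for less certainly identified residues can be illustrated with a minimal sketch. It assumes that the per-site posterior probabilities have already been parsed from the reconstruction output into one dictionary per site; the function name `summarize_posteriors`, the data layout, and the toy values are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple


def summarize_posteriors(
    site_posteriors: List[Dict[str, float]], margin: float = 0.1
) -> Tuple[str, float, List[int]]:
    """Return the most probable sequence, the mean top posterior,
    and the 0-based indices of sites whose top posterior exceeds
    the runner-up by less than `margin`."""
    sequence = []
    top_values = []
    ambiguous = []
    for index, probs in enumerate(site_posteriors):
        # Rank residues at this site from most to least probable.
        ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
        best_residue, best_p = ranked[0]
        second_p = ranked[1][1] if len(ranked) > 1 else 0.0
        sequence.append(best_residue)
        top_values.append(best_p)
        if best_p - second_p < margin:
            ambiguous.append(index)
    mean_top = sum(top_values) / len(top_values)
    return "".join(sequence), mean_top, ambiguous


# Toy, made-up posteriors for three sites (illustration only).
toy = [
    {"I": 0.95, "V": 0.05},
    {"L": 0.52, "I": 0.48},
    {"F": 1.0},
]
seq, mean_top, ambiguous = summarize_posteriors(toy)
print(seq, round(mean_top, 3), ambiguous)
```

In this toy input only the second site falls under the 0.1 margin, so it is the only one flagged as ambiguous.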

Ancestral luciferase geological age estimation

The geological ages of the ancestral luciferases were estimated with the Bayesian MCMC method using BEAST v1.8.4. A total of 63 18S rRNA gene sequences of Elateriformia species were aligned by using MAFFT v7.309, and the alignment was manually corrected. A total of 665 aligned sites were selected by avoiding the sites on or near undetermined or gapped positions. The alignment and the phylogeny based on that of luciferases were submitted for an MCMC calculation. The general time reversible (GTR) substitution model with four gamma categories was adopted. The fossil date of 152 Ma ago (standard deviation, 10 Ma) for the most recent common ancestor (TMRCA) of Elateroidea (16) was used as the reference, and the dates of the nodes were estimated with the uncorrelated relaxed clock model. A total of 10⁸ MCMC iterations were performed, and the 95% HPD interval was obtained excluding the first 10⁷ iterations as a burn-in process. The trajectories were analyzed with TRACER v1.6.0 and FIGTREE v1.4.2. The results of ancestral node dating using the Bayesian MCMC method demonstrated that the date of the reference node (Elateroidea common ancestor) had adequately converged and was distributed normally around the expected date, with a mean of 150.7 Ma ago and a 95% HPD from 131.15 to 170.46 Ma ago, based on an effective sample size of 87,588. The estimated geological ages of ancestral luciferases are shown in Table 1.
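For readers unfamiliar with HPD intervals, the following is a minimal sketch of how a 95% highest posterior density interval can be obtained from a sampled node age after discarding burn-in. It is not the BEAST or TRACER implementation; the helper names, the one-tenth burn-in fraction, and the synthetic normally distributed trace are assumptions made for illustration.

```python
import random
from typing import List, Tuple


def discard_burn_in(trace: List[float], fraction: float = 0.1) -> List[float]:
    """Drop the leading `fraction` of the samples as burn-in."""
    cut = int(len(trace) * fraction)
    return trace[cut:]


def hpd_interval(samples: List[float], mass: float = 0.95) -> Tuple[float, float]:
    """Narrowest interval that contains `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, int(round(mass * n)))  # samples inside the interval
    best_low, best_high = ordered[0], ordered[window - 1]
    # Slide a fixed-size window over the sorted samples, keep the narrowest.
    for start in range(1, n - window + 1):
        low, high = ordered[start], ordered[start + window - 1]
        if high - low < best_high - best_low:
            best_low, best_high = low, high
    return best_low, best_high


# Synthetic trace (not real MCMC output): ages drawn around 152 Ma, SD 10 Ma.
rng = random.Random(0)
trace = [rng.gauss(152.0, 10.0) for _ in range(20000)]
kept = discard_burn_in(trace, fraction=0.1)
low, high = hpd_interval(kept, mass=0.95)
print(f"95% HPD: {low:.2f} to {high:.2f} Ma")
```

For a roughly normal posterior, the HPD interval is close to the mean plus or minus 1.96 standard deviations, which is why a prior of 152 ± 10 Ma yields an interval about 39 Ma wide.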

Expression and purification of recombinant proteins

Full open reading frames (ORFs) of the ancestral sequences were synthesized (Fasmac, Kanagawa, Japan) with E. coli optimized codon use (GenBank accession numbers, LC534642–LC534648) and ligated in frame into the expression vector pCold-ZZ-P-X (42). The vectors were introduced into E. coli BL21 (DE3) pLysS (Promega, Madison, WI, USA), and the recombinant proteins were expressed by cold shock at 15°C for 30 min with 0.2 mM isopropyl-β-d-thiogalactopyranoside. The harvested cells were disrupted by sonication (4°C, 50 W, 5 s) three times using a Sonicator (Ohtake Works, Tokyo, Japan), and the supernatant was adsorbed on a Talon cobalt-chelating column (Takara Bio, Shiga, Japan). To remove the fused ZZ domain and hexahistidine tag, the column was washed using a wash buffer [50 mM tris-HCl (pH 7.0), 1 M NaCl, and 15 mM imidazole], an on-column digestion reaction using PreScission protease (GE Healthcare, Piscataway, NJ, USA) was then conducted at 4°C overnight (12 hours), and the protein was eluted using an elution buffer [50 mM tris-HCl (pH 7.0), 150 mM NaCl, 1 mM EDTA, and 1 mM dithiothreitol]. Protein concentration was measured using Bio-Rad Protein Assay Dye Reagent (Bio-Rad, Hercules, CA, USA) with bovine serum albumin Fraction V (Sigma-Aldrich, St. Louis, MO, USA) as a standard. The homogeneity and the concentration estimated by protein assay were confirmed by SDS–polyacrylamide gel electrophoresis using a 10% separation gel with Coomassie Brilliant Blue (CBB) staining (fig. S4). We tested the stability of each recombinant protein by leaving the solution on ice and confirmed that the luminescence activity had not decreased in the order of total light intensities for all proteins even after 12 hours (table S6). This suggests that enzymatic inactivation of the ancestral proteins during the purification process was negligible. The purified protein was stored at −80°C until use.

Measurement of the luminescence spectra

Luminescence spectra were measured using a spectrophotometer AB-1850 (ATTO, Tokyo, Japan). Spectral sensitivity was calibrated. Purified recombinant luciferase (2.5 or 5 μg) was mixed with 500 μM d-luciferin, 4 mM ATP, and 8 mM MgCl2 in 0.1 M tris-HCl buffer (pH 8.0). Exposure time was 1 to 2 min. P. pyralis luciferase (PpyLuc1) was purchased from Sigma-Aldrich. Because of its very low luminescence activity, 10 μg of the protein was used for measuring the spectrum of AncElat.

Measurement of the luminescence intensities

Luminescence intensity was measured using a luminometer CLX-101 (TOYOBO). The luminescence reaction was initiated by injecting 50 μl of the purified recombinant luciferase (430 ng) into 50 μl of a mixture of d-luciferin (final concentration, 50 or 200 μM), ATP (final concentration, 100 μM), and MgCl2 (final concentration, 5 mM) in tris-HCl buffer (pH 7.8) (final concentration, 50 mM). Light intensity was integrated from 2 to 32 s after mixing (total of 30 s).
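The integration window can be illustrated with a short sketch. It assumes the luminometer output is available as time-stamped intensity readings with strictly increasing times, which the text does not state; the function name `integrate_window` and the example readings are hypothetical, and the instrument may well perform this integration internally.

```python
from typing import Sequence


def integrate_window(
    times: Sequence[float],
    intensities: Sequence[float],
    start: float = 2.0,
    stop: float = 32.0,
) -> float:
    """Trapezoidal integral of intensity over [start, stop] seconds.

    Segments that straddle the window edges are clipped by linear
    interpolation. Times must be strictly increasing.
    """
    total = 0.0
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        y0, y1 = intensities[i], intensities[i + 1]
        if t1 <= start or t0 >= stop:
            continue  # segment lies entirely outside the window
        a, b = max(t0, start), min(t1, stop)
        slope = (y1 - y0) / (t1 - t0)
        ya = y0 + slope * (a - t0)
        yb = y0 + slope * (b - t0)
        total += 0.5 * (ya + yb) * (b - a)
    return total


# Made-up readings: one per second for 40 s at a constant 100 counts/s.
times = [float(t) for t in range(41)]
flat = [100.0] * 41
ramp = times[:]  # intensity equal to time, to check the clipping arithmetic
flat_total = integrate_window(times, flat)
ramp_total = integrate_window(times, ramp)
print(flat_total, ramp_total)
```

Skipping the first 2 s excludes the injection and mixing transient, so the integral reflects the 30 s that follow.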

Measurement of the ACS activities

Thioesterification activity was determined using ultraviolet-visible spectrophotometry (Eppendorf, BioSpectrometer kinetic). In this assay, we measured the initial rate of AMP formation by coupling the thioesterification reaction with adenylate kinase, pyruvate kinase, and lactate dehydrogenase and monitoring the oxidation of the reduced form of nicotinamide adenine dinucleotide (NADH) at 340 nm (6220 M⁻¹ cm⁻¹) (43). The standard reaction mixture for this assay contained 100 mM tris-HCl (pH 8.0), 10 mM ATP, 10 mM MgCl2, 0.35 mM lauric acid (C12:0) substrate, 2 mM CoASH, 1 mM phosphoenolpyruvic acid, 0.4 mM NADH, adenylate kinase (40 μg/ml), pyruvate kinase (20 μg/ml), lactate dehydrogenase (20 μg/ml), and 10 μg of purified recombinant luciferase. The total volume was brought to 500 μl. The mixture containing all components except for the luciferase was incubated at room temperature (27 ± 2°C) for 10 min. The reaction was then initiated by addition of the enzyme, and data were collected at 5-s intervals for 10 min.
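The conversion from an absorbance trace to a rate of AMP formation can be sketched as follows. The extinction coefficient and the 500-μl volume are taken from the text; the 1-cm path length, the stoichiometry of two NADH oxidized per AMP formed (adenylate kinase converts AMP and ATP into two ADP, each of which is recycled through pyruvate kinase and lactate dehydrogenase), the function names, and the synthetic trace are assumptions for illustration, not the authors' analysis script.

```python
from typing import Sequence

EPSILON_NADH = 6220.0  # M^-1 cm^-1 at 340 nm, as given in the text


def initial_slope(times_s: Sequence[float], absorbances: Sequence[float]) -> float:
    """Least-squares slope of absorbance versus time (per second)."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_a = sum(absorbances) / n
    num = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, absorbances))
    den = sum((t - mean_t) ** 2 for t in times_s)
    return num / den


def amp_rate_nmol_per_min(
    slope_per_s: float,
    volume_l: float = 500e-6,   # reaction volume from the text
    path_cm: float = 1.0,       # assumed cuvette path length
    nadh_per_amp: float = 2.0,  # assumed coupling stoichiometry
) -> float:
    """Convert an A340 slope into nmol of AMP formed per minute."""
    # Beer-Lambert law: concentration change = absorbance change / (epsilon * l).
    nadh_molar_per_min = -slope_per_s * 60.0 / (EPSILON_NADH * path_cm)
    return nadh_molar_per_min / nadh_per_amp * volume_l * 1e9


# Synthetic trace: A340 falling by 0.0002 per second, sampled every 5 s.
times = [5.0 * i for i in range(13)]
trace = [1.0 - 0.0002 * t for t in times]
slope = initial_slope(times, trace)
rate = amp_rate_nmol_per_min(slope)
specific_activity = rate / 0.010  # nmol/min per mg, for 10 ug of enzyme
print(round(rate, 4), round(specific_activity, 2))
```

Only the initial, linear part of the trace should be passed to `initial_slope`, since the rate falls as NADH and substrate are consumed.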

Crystal structure analyses

The structures of AncLamp were determined by x-ray crystallography. The AncLamp crystals were grown by the hanging drop vapor diffusion method, under conditions using 0.1 M trisodium citrate buffer (pH 5.5) containing 20% (w/v) PEG3000 (polyethylene glycol, molecular weight 3000) as a 0.5-ml reservoir, and a mixture of 2 μl of reservoir solution and 2 μl of protein solution in 50 mM tris-HCl (pH 8.0) buffer containing 1% (w/v) AncLamp in the hanging drop. To cocrystallize the protein with reaction-intermediate analog or substrates, 0.2 mM DLSA or 0.2 mM ATP and 0.2 mM d-luciferin were added to the protein solutions. All crystals were grown at 18°C for a few weeks.

X-ray diffraction data were collected from loop-mounted crystals under cryogenic conditions, with a charge-coupled device detector Quantum315 (ADSC) at BL38B1 or Eiger4M at BL26B2 in SPring-8 (Hyogo, Japan). The crystals were soaked for 10 to 30 s in the corresponding crystal growth buffer containing 15% (v/v) 2-methyl-2,4-pentanediol for cryoprotection. The diffraction images were processed with the MOSFLM program (44).

The crystal structures were solved by the molecular replacement method using the Phaser-MR application of PHENIX (45). The crystal structure of L. cruciata LcLuc1 (PDB code 2D1S) was used as the search model. The model refinements were conducted by using COOT and the phenix.refine application of PHENIX (45, 46). The quality of the models was evaluated with the PROCHECK program (47). The crystallographic parameters, data collection and refinement statistics, and PDB codes are summarized in table S2. The atomic coordinates and structure factors of the AncLamp-DLSA and AncLamp–d-luciferin–ATP complexes have been deposited in the PDB, with the accession codes 6K4C and 6K4D, respectively. The molecular graphics were prepared with CHIMERA (48).

The cavity volume of the substrate-binding site was evaluated by using the CASTp application (49). To close the opening for substrates, the AMP moiety of the DLSA or luciferyl-AMP was treated as dummy atoms, and the volume available for the luciferyl moiety was evaluated.


Supplementary material for this article is available at

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We thank V. B. Meyer-Rochow (Andong National University, Korea and Oulu University, Finland) and J. C. Day (Centre for Ecology and Hydrology, NERC, UK) for critical comments and discussions and S. Kanie (Nagoya University, Japan) for providing the synthetic DLSA. We are grateful to the Japan Synchrotron Radiation Research Institute (JASRI) for support at the synchrotron facilities at SPring-8. Funding: This work was partly supported by Grants-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology-Japan (JP17H01818) and Platform Project for Supporting Drug Discovery and Life Science Research [Basis of Supporting Innovative Drug Discovery and Life Science Research (BINDS)] from AMED (JP19am0101111 support number 2067) to T.S. and JST CREST (JPMJCR16N1) and Ministry of Education, Culture, Sports, Science and Technology-Japan (JP20H03002) to Y.O. Author contributions: Y.O., D.K., and T.S. conceived and designed the study. K.K., D.Y., H.S., D.K., and T.S. carried out the molecular and biochemical experiments. T.S. performed the ancestral sequence analyses and crystallography. Y.O., D.K., and T.S. wrote the manuscript and edited the manuscript based on discussions with all authors. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper, the Supplementary Materials, DDBJ, and PDB. Additional data related to this paper may be requested from the authors.
