Any student who has lost hours of work to a computer crash knows the value of backing up important files. Yet long before the first distraught student uttered shrieks of dismay at disappearing data, plants were saving an extra copy of certain genes—or so say Brad Chapman and his colleagues in a recent paper that offers a fresh look at what happens to duplicated genes when polyploids are formed.

A polyploid is the result of genome duplication (Bowers et al. 2003), which may occur when errors during meiosis produce aberrant gametes with 2N rather than 1N chromosomes. Although genome duplication rarely occurs in animal species, many plants are polyploid; banana and sugarcane, for example, both carry multiple copies of their chromosomes (Chapman et al. 2006). But not all species whose progenitors experienced genome duplication actually have twice the normal number of chromosomes. Over time, duplicate genes follow one of two fates (Bowers et al. 2006): the duplicate may be retained, leaving an extra copy that Chapman et al. (2006) dub a ‘paleolog’; or the extra copy may be deleted, leaving a ‘singleton,’ a gene that lacks a paleologous copy.

These extra copies of genes have long been viewed as prime real estate for the evolution of new gene functions. The classical theory suggests that as long as one paleologous copy remains intact to perform its normal function, mutations can persist in the other copy, possibly leading it to perform new roles in the organism. This is called functional divergence (Chapman et al. 2006). Alternatively, mutations may allow the two copies to divide the original task between them, carrying out a process called subfunctionalization (Tocchini-Valentini et al. 2005).

Either way, genes retained as duplicates are expected to show more severe changes to amino acid sequence than genes for which only a single copy is kept. This central prediction of the classical model has been demonstrated in several studies (Chapman et al. 2006). Nevertheless, the classical model based on functional divergence and subfunctionalization is not the whole story of duplicated genes. Other evidence points in a different direction, suggesting a new, emerging model based on “functional compensation” or “functional buffering.” Duplicate genes from organisms as diverse as Xenopus, Arabidopsis and Saccharomyces actually show unexpected similarity even long after the original duplication event. In fact, it appears that they may even undergo processes that maintain the similarity between them.

Which model is more appropriate? The classical model asserts that duplicates will change while singletons are conserved. The emerging functional buffering model proposes the opposite. It predicts that paleologous duplicates will undergo fewer changes than singleton genes do in their nucleotide and amino acid sequences. In this battle of the models, Chapman et al. (2006) set out to address intriguing, unanswered questions of molecular evolution: how is the evolution of a gene sequence affected by the presence of an extra copy of that gene from a genome duplication event? And does the effect of a paleologous copy on gene evolution depend on the type of gene in question?

Chapman et al. (2006) take advantage of information from genome sequencing of Arabidopsis and rice to explore the nature of singleton and duplicate genes from the same genome duplication event. While it seems that the rice lineage went through only one such duplication, Arabidopsis bears evidence of three (α, β and γ, where α is the most recent). By following the fate of Arabidopsis genes over each of these duplications, Chapman et al. (2006) were able to investigate whether any given gene follows the same fate each time the genome is duplicated.

The resulting data show striking support for the hypothesis that particular types of genes are preferentially retained in duplicate. Over the course of three Arabidopsis duplications, genes reduced back to single status after the γ duplication are usually reduced to singletons again after the α and β events as well. Likewise, genes duplicated in the γ event usually show up as duplicates following the α and β events.

Not only are some genes preferentially retained as duplicates, but the differences between these duplicates are fewer and less severe than between homologous singleton genes in related ecotypes or subspecies. For paleologs, single-nucleotide polymorphisms (single base-pair differences in otherwise identical stretches of DNA) tend to fall in the third codon position, where they are unlikely to alter the amino acid in the functional protein, whereas mutations retained by singleton genes usually do alter the protein product. This stands in stark contrast with the classical prediction that duplicate genes rather than singletons will show more drastic (i.e., protein-altering) changes in their DNA sequences.
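The distinction between third-position changes and protein-altering changes can be made concrete with a small sketch. The Python below is only an illustration of the general idea, not the analysis pipeline used by Chapman et al. (2006); the function name `classify_snp`, the toy coding sequence, and the use of the standard genetic code are all assumptions introduced here.

```python
# Illustrative sketch only: classify a single-base change in a coding
# sequence by codon position and by whether it alters the amino acid.
# Assumes the standard genetic code and an in-frame sequence starting at
# index 0; the names below are hypothetical, not from the reviewed paper.

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with T
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)
# Map every codon (e.g. "GCT") to its one-letter amino acid ("*" = stop).
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def classify_snp(cds, pos, alt):
    """Return (codon_position, is_synonymous) for a substitution.

    cds: coding DNA sequence, in frame from index 0
    pos: 0-based index of the changed base
    alt: the replacement base
    """
    offset = pos % 3                      # 0, 1 or 2 within the codon
    start = pos - offset
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    is_synonymous = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return offset + 1, is_synonymous


toy_cds = "ATGGCTGAA"                     # Met-Ala-Glu (made-up example)
print(classify_snp(toy_cds, 5, "C"))      # GCT -> GCC, third position
print(classify_snp(toy_cds, 4, "A"))      # GCT -> GAT, second position
```

In this toy sequence the third-position change leaves alanine unchanged, while the second-position change replaces alanine with aspartate, which mirrors the contrast drawn above between the polymorphisms typical of paleologs and those typical of singletons.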

This alone lends credibility to the emerging functional buffering model, but Chapman et al. (2006) take the analysis one step further by investigating the characteristics of those genes that tend to be retained as duplicates. They find that paleologs tend to be longer than singletons, and that discernible domains (the regions of genes responsible for particular functions such as DNA binding) make up a larger portion of genes retained in duplicate.

When a second copy is present at a paleologous locus, domains of these genes also show fewer mutations than domains in singletons, whereas non-domains show similar mutation rates for both singletons and paleologs. This suggests that some mechanism may be at work to maintain similarity in the key functional regions of paleologs (Chapman et al. 2006). Hence, it does appear that functional buffering describes the pattern of change in duplicate genes.

Although Chapman et al. (2006) provide further empirical evidence for the as yet less recognized functional buffering model, the classical model still has strong supporting evidence. In light of the various data, Chapman et al. (2006) propose a synthesis of the two models. Perhaps, they say, functional buffering and functional divergence are points along a spectrum of evolutionary possibilities for genes retained in duplicate. Some classes of genes (and only some classes) may offer greater fitness when kept as similar as possible. Other studies have found that these classes include transcription factors, signal transducers and developmental genes (Maere et al. 2005). In other classes of genes, duplicates may diverge, even to the point where they are no longer recognizable as paleologs (Chapman et al. 2006). In the first case, the fitness advantage lies in having a backup copy to do the job if the other copy is damaged. In the second case, the fitness advantage lies in the opportunity to acquire a new function (Chapman et al. 2006).

Chapman et al. (2006) go on to expand the model with a radical suggestion: perhaps successful polyploids are more likely to form when this buffering has eventually been eroded to the extent that the two copies can no longer do the same job. Evidence that this may be possible is found within their own data. Paleologs in Arabidopsis and rice are not identical—they simply show fewer DNA base pair changes than singleton genes do. In fact, paleologs from the first two genome duplications in Arabidopsis statistically show the same degree of change as singleton genes from the same duplication event, whereas the duplicates from the most recent duplication still statistically show less drastic change. Perhaps as the buffering capacities of the paleologs break down over time, new polyploids (which are often less fit) may have higher relative fitness than their progenitors because they have regained the functional buffering lost since the previous genome duplication (Chapman et al. 2006).

Several questions remain for further research. Chapman et al. (2006) point out that their results focus on protein-encoding DNA, not regulatory regions. What happens to the double copies of non-coding regulatory sequences after genome duplication? Blanc and Wolfe (2004) found that duplicates from the most recent polyploidization in the Arabidopsis lineage are not transcribed at the same level. We could further our understanding of both molecular evolution and gene regulation by examining what causes this difference in expression of paleologous genes.

Future research could also delve further into the types of genes preferentially retained as duplicates. Chapman et al. (2006) showed that genes with recognizable paleologs tend to be longer and more complex than genes reduced back to single status after the genome duplication. But are most long, complex genes retained in duplicate, or is it simply that most paleologs are long and complex?

Another important question arises from Chapman et al.’s (2006) suggestion that there is a selective advantage to retaining a paleologous copy of certain genes. Yet the duplicate may not necessarily confer higher fitness per se. Perhaps the types of genes that tend to be retained as highly similar duplicates are simply those for which the cell cannot tolerate aberrant forms even if the normal type is still available at another locus.

These questions have yet to be answered, but the fact remains that Chapman et al. (2006) offer an excellent contribution to our understanding of molecular evolution by demonstrating the effect of a paleologous copy on gene evolution and by proposing a united model for how evolution will proceed in different classes of genes. As we understand more about molecular evolution following genome duplications, we can further probe Chapman et al.’s (2006) most novel suggestion: that the breakdown of functional buffering creates a situation in which polyploids are more likely to survive as successful lineages. Given that new domestic and natural species have arisen through polyploidization (Raven et al. 1992), understanding how functional buffering affects the success of new polyploids brings us closer to resolving the most burning issue of evolutionary biology: the mechanisms of speciation.


Blanc, G and KH Wolfe. 2004. Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution. Plant Cell 16: 1679-1691.

Bowers, JE, BA Chapman, J Rong and AH Paterson. 2003. Unraveling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422: 433-438.

Chapman, BA, JE Bowers, FA Feltus and AH Paterson. 2006. Buffering of crucial functions by paleologous duplicated genes may contribute cyclicality to angiosperm genome duplication. PNAS 103(8): 2730-2735.

Maere, S, S De Bodt, J Raes, T Casneuf, M Van Montagu, M Kuiper and Y Van de Peer. 2005. Modeling gene and genome duplications in eukaryotes. PNAS 102: 5454-5459.

Raven, PH, RF Evert and SE Eichhorn. 1992. Biology of Plants, fifth edition. Worth Publishers: New York.

Tocchini-Valentini, GD, P Fruscoloni and GP Tocchini-Valentini. 2005. Structure, function, and evolution of the tRNA endonucleases of Archaea: An example of subfunctionalization. PNAS 102: 8933-8938.