Is copy number variation dynamic?

Is there any evidence that copy number variation changes over time? I want to model interactions between expression levels as a dynamic Bayesian network, but my approach would need to assume that copy number is static.

Your question could be phrased more specifically to avoid ambiguity, but if I rephrase it the way I suspect you mean it ("Is there any evidence showing that [the rate of] copy number variation changes over time?"), then yes, there is.

The rate depends on many factors, including the mechanism, the organism, and the region of the organism's genome, and, as @Michael wrote, the scale (among other things) under consideration.

So your assumption that the rate is static should probably be stated explicitly.
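One rough way to check the static assumption before committing to it is to look at repeated copy-number estimates per subject and flag those that drift beyond measurement noise. The sketch below is only an illustration: the function name, the 20% tolerance, and the example values are all hypothetical, and a real analysis would need a proper error model for the assay.

```python
from statistics import mean

def flag_dynamic_subjects(series_by_subject, rel_tolerance=0.2):
    """Return subjects whose copy-number estimates vary by more than
    rel_tolerance (as a fraction of the subject's mean) across time points."""
    flagged = {}
    for subject, values in series_by_subject.items():
        avg = mean(values)
        # Spread between the largest and smallest estimate, relative to the mean.
        spread = (max(values) - min(values)) / avg
        if spread > rel_tolerance:
            flagged[subject] = round(spread, 3)
    return flagged

# Hypothetical repeated measurements (arbitrary units), one list per subject.
measurements = {
    "subject_A": [10.0, 10.2, 9.9, 10.1],
    "subject_B": [10.0, 12.5, 8.0, 11.0],
}
print(flag_dynamic_subjects(measurements))
```

If most subjects pass such a check for the loci of interest, treating copy number as a fixed covariate in the network is easier to defend; if not, copy number probably needs to be a time-varying node.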

Nature Reviews Genetics 10, 551-564 (August 2009) | doi:10.1038/nrg2593

Highly dynamic temporal changes of TSPY gene copy number in aging bulls

The Y-chromosomal TSPY gene is one of the highest copy number mammalian protein-coding genes and represents a unique biological model to study various aspects of genomic copy number variations. This study investigated the age-related copy number variability of the bovine TSPY gene, a new and unstudied aspect of the biology of TSPY, whose copy number has been shown to vary among cattle breeds, individual bulls and somatic tissues. The subjects of this prospective 30-month long study were 25 Holstein bulls, sampled every six months. Real-time quantitative PCR was used to determine the relative TSPY copy number (rTSPY CN) and telomere length in the DNA samples extracted from blood. Twenty bulls showed an altered rTSPY CN after 30 months, although only 9 bulls showed a significant change (4 with a significant increase and 5 with a significant decrease, P<0.01). The sequential sampling provided the flow of rTSPY CN over six observations in 30 months, and widespread variation of rTSPY CN was detected. Although a clear trend of the direction of change was not identifiable, the highly dynamic changes of individual rTSPY CN in aging bulls were observed here for the first time. In summary, we have observed a highly variable rTSPY CN in bulls over a short period of time. Our results suggest the importance of further long-term studies of the dynamics of rTSPY CN variability.
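For readers unfamiliar with how a relative copy number can be derived from real-time qPCR, the following is a minimal sketch of the widely used 2^-ΔΔCt calculation. The abstract does not say which quantification formula the authors used, so this is an assumption for illustration only; the function name and the Ct values are hypothetical, and the formula assumes roughly 100% amplification efficiency for both target and reference.

```python
def relative_copy_number(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative copy number by the 2^-ddCt method.

    ct_target / ct_reference: Ct values of the target and reference gene in the sample.
    ct_target_cal / ct_reference_cal: the same two Ct values in a calibrator sample.
    Assumes both assays amplify with ~100% efficiency (a doubling per cycle).
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_cal - ct_reference_cal
    return 2.0 ** -(delta_sample - delta_calibrator)

# Hypothetical Ct values: relative to the reference gene, the target crosses
# threshold one cycle earlier in the sample than in the calibrator.
print(relative_copy_number(20.0, 25.0, 21.0, 25.0))  # 2.0
```

A one-cycle shift corresponds to a twofold difference, which is why small Ct measurement errors translate into noticeable uncertainty in the estimated copy number.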

Conflict of interest statement

Competing Interests: The commercial affiliation does not alter our adherence to PLOS ONE policies on sharing data and materials. TK participated in providing samples. The commercial affiliation has no competing interest in any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products.


Fig 1. Pie chart representation of change in TSPY copy number in the 25 bulls…

Fig 2. Relative TSPY copy number change over the 30 months sampling period (rTSPY CN…

Fig 3. rTSPY CN measured by qPCR at every six months (T0-T30) during the 30…

Spectrum of large copy number variations in 26 diverse Indian populations: potential involvement in phenotypic diversity

Copy number variations (CNVs) have provided a dynamic aspect to the apparently static human genome. We have analyzed CNVs larger than 100 kb in 477 healthy individuals from 26 diverse Indian populations of different linguistic, ethnic and geographic backgrounds. These CNVRs were identified using the Affymetrix 50K Xba 240 Array. We observed 1,425 and 1,337 CNVRs in the deletion and amplification sets, respectively, after pooling data from all the populations. More than 50% of the genes encompassed entirely in CNVs had both deletions and amplifications. There was wide variability across populations not only with respect to CNV extent (ranging from 0.04-1.14% of genome under deletion and 0.11-0.86% under amplification) but also in terms of functional enrichments of processes like keratinization, serine proteases and their inhibitors, cadherins, homeobox, olfactory receptors etc. These did not correlate with linguistic, ethnic, geographic backgrounds and size of populations. Certain processes were near exclusive to deletion (serine proteases, keratinization, olfactory receptors, GPCRs) or duplication (homeobox, serine protease inhibitors, embryonic limb morphogenesis) datasets. Populations having same enriched processes were observed to contain genes from different genomic loci. Comparison of polymorphic CNVRs (5% or more) with those cataloged in Database of Genomic Variants revealed that 78% (2473) of the genes in CNVRs in Indian populations are novel. Validation of CNVs using Sequenom MassARRAY revealed extensive heterogeneity in CNV boundaries. Exploration of CNV profiles in such diverse populations would provide a widely valuable resource for understanding diversity in phenotypes and disease.
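The "percent of genome under deletion" figures quoted above are a coverage statistic. As a rough illustration of how such a number can be computed, the sketch below merges overlapping intervals and divides the covered length by the genome length. It is not the authors' pipeline: the function name and the toy coordinates are hypothetical, and it treats the genome as a single coordinate axis, whereas a real calculation would be done per chromosome and summed.

```python
def percent_genome_covered(intervals, genome_length):
    """Percent of a genome covered by the union of half-open (start, end)
    intervals lying on a single coordinate axis."""
    covered = 0
    current_start, current_end = None, None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Close the previous merged block (if any) and start a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or touching interval: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return 100.0 * covered / genome_length

# Toy example: two overlapping calls and one separate call on a 10,000 bp axis.
calls = [(100, 300), (250, 400), (1000, 1100)]
print(percent_genome_covered(calls, 10_000))  # 4.0
```

Merging before summing matters because overlapping calls would otherwise be counted twice and inflate the percentage.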

A curated catalogue of structural variation in the human genome

  • How much copy number variation (CNV) exists between human genomes?
  • How best can CNVs be incorporated into whole genome association studies?
  • What is the contribution of copy number variation to genetic disease?
  • What is the relative contribution of different mutational mechanisms to CNV?
  • What is the genomic impact of CNV on gene expression?
  • What role has copy number variation played in recent human evolution?
  • Increasing medical and scientific knowledge about chromosomal microdeletions/duplications
  • Improving medical care and genetic advice for individuals/families with submicroscopic chromosomal imbalance
  • Facilitating research into the study of genes which affect human development and health

Large-scale copy number variants (CNVs): Distribution in normal subjects and FISH/real-time qPCR analysis.
Ying Qiao, Xudong Liu, Chansonette Harvard, Sarah L Nolin, W Ted Brown, Maryam Koochek, Jeanette JA Holden, ME Suzanne Lewis, and Evica Rajcan-Separovic
BMC Genomics 2007, 8: 167-177

Global variation in copy number in the human genome.
Richard Redon, Shumpei Ishikawa, Karen R. Fitch, Lars Feuk, George H. Perry, T. Daniel Andrews, Heike Fiegler, Michael H. Shapero, Andrew R. Carson, Wenwei Chen, Eun Kyung Cho, Stephanie Dallaire, Jennifer L. Freeman, Juan R. Gonzalez, Monica Gratacos, Jing Huang, Dimitrios Kalaitzopoulos, Daisuke Komura, Jeffrey R. MacDonald, Christian R. Marshall, Rui Mei, Lyndal Montgomery, Kunihiro Nishimura, Kohji Okamura, Fan Shen, Martin J. Somerville, Joelle Tchinda, Armand Valsesia, Cara Woodwark, Fengtang Yang, Junjun Zhang, Tatiana Zerjal, Jane Zhang, Lluis Armengol, Donald F. Conrad, Xavier Estivill, Chris Tyler-Smith, Nigel P. Carter, Hiroyuki Aburatani, Charles Lee, Keith W. Jones, Stephen W. Scherer & Matthew E. Hurles
Nature (2006) Vol 444. 444-454

Copy number variation and evolution in humans and chimpanzees.
Perry GH, Yang F, Marques-Bonet T, Murphy C, Fitzgerald T, Lee AS, Hyland C, Stone AC, Hurles ME, Tyler-Smith C, Eichler EE, Carter NP, Lee C, Redon R.
Genome Res. 2008 18(11): 1698-1710

Simultaneous mutation and copy number variation (CNV) detection by multiplex PCR-based GS-FLX sequencing.
Goossens D, Moens LN, Nelis E, Lenaerts AS, Glassee W, Kalbe A, Frey B, Kopal G, De Jonghe P, De Rijk P, Del-Favero J.
Hum Mutat. 2009 30(3): 472-476

Genome-wide analysis of transcript isoform variation in humans.
Kwan T, Benovoy D, Dias C, Gurd S, Provencher C, Beaulieu P, Hudson TJ, Sladek R, Majewski J.
Nat Genet. 2008 40(2): 225-231.

Transcript copy number estimation using a mouse whole-genome oligonucleotide microarray.
Mark G Carter, Alexei A Sharov, Vincent VanBuren, Dawood B Dudekula, Condie E Carmack, Charlie Nelson and Minoru S H Ko
Genome Biology 2005, 6:R61

Genome-wide copy-number-variation study identified a susceptibility gene, UGT2B17, for osteoporosis.
Yang TL, Chen XD, Guo Y, Lei SF, Wang JT, Zhou Q, Pan F, Chen Y, Zhang ZX, Dong SS, Xu XH, Yan H, Liu X, Qiu C, Zhu XZ, Chen T, Li M, Zhang H, Zhang L, Drees BM, Hamilton JJ, Papasian CJ, Recker RR, Song XP, Cheng J, Deng HW.
Am J Hum Genet. 2008 83(6): 663-674

Copy-number variation genotyping of GSTT1 and GSTM1 gene deletions by real-time PCR.
Rose-Zerilli MJ, Barton SJ, Henderson AJ, Shaheen SO, Holloway JW.
Clin Chem. 2009 55(9): 1680-1685

Statistical tools for transgene copy number estimation based on real-time PCR.
Joshua S Yuan, Jason Burris, Nathan R Stewart, Ayalew Mentewab and C Neal Stewart
BMC Bioinformatics 2007, 8(): S6

Copy number variation goes clinical.
A meeting report
Le Caignec C, Redon R.
Genome Biol. 2009 10(1): 301-303

Copy number variation: New insights in genome diversity.
Jennifer L. Freeman, George H. Perry, Lars Feuk, Richard Redon, Steven A. McCarroll, David M. Altshuler, Hiroyuki Aburatani, Keith W. Jones, Chris Tyler-Smith, Matthew E. Hurles, Nigel P. Carter, Stephen W. Scherer, and Charles Lee
Genome Research (2006) 16:949

Real-Time Quantitative PCR as an Alternative to Southern Blot or Fluorescence In Situ Hybridization for Detection of Gene Copy Number Changes.
Jasmien Hoebeeck, Frank Speleman, and Jo Vandesompele
Methods in Molecular Biology, vol. 353: 205-226
Protocols for Nucleic Acid Analysis by Nonradioactive Probes, Second Edition Edited by: E. Hilario and J. Mackay

An accurate method for quantifying and analyzing copy number variation in porcine KIT by an oligonucleotide ligation assay.
Bo-Young Seo, Eung-Woo Park, Sung-Jin Ahn, Sang-Ho, Jae-Hwan Kim, Hyun-Tae Im, Jun-Heon Lee, In-Cheol Cho, Il-Keun Kong and Jin-Tae Jeon
BMC Genetics (2007) 8:81

Large-Scale Copy Number Polymorphism in the Human Genome.
Jonathan Sebat, B. Lakshmi, Jennifer Troge, Joan Alexander, Janet Young, Par Lundin, Susanne Maner, Hillary Massa, Megan Walker, Maoyen Chi, Nicholas Navin, Robert Lucito, John Healy, James Hicks, Kenny Ye, Andrew Reiner, T. Conrad Gilliam, Barbara Trask, Nick Patterson, Anders Zetterberg, Michael Wigler
SCIENCE (2004) VOL 305 525-528

Accurate and reliable high-throughput detection of copy number variation in the human genome.
Heike Fiegler, Richard Redon, Dan Andrews, Carol Scott, Robert Andrews, Carol Carder, Richard Clark, Oliver Dovey, Peter Ellis, Lars Feuk, Lisa French, Paul Hunt, Dimitrios Kalaitzopoulos, James Larkin, Lyndal Montgomery, George H. Perry, Bob W. Plumb, Keith Porter, Rachel E. Rigby, Diane Rigler, Armand Valsesia, Cordelia Langford, Sean J. Humphray, Stephen W. Scherer, Charles Lee, Matthew E. Hurles, and Nigel P. Carter
Genome Research (2006) 16:1566

Detection of large-scale variation in the human genome.
A John Iafrate, Lars Feuk, Miguel N Rivera, Marc L Listewnik, Patricia K Donahoe, Ying Qi, Stephen W Scherer & Charles Lee

Genome assembly comparison identifies structural variants in the human genome.
Razi Khaja, Junjun Zhang, Jeffrey R MacDonald, Yongshu He, Ann M Joseph-George, John Wei, Muhammad A Rafiq, Cheng Qian, Mary Shago, Lorena Pantano, Hiroyuki Aburatani, Keith Jones, Richard Redon, Matthew Hurles, Lluis Armengol, Xavier Estivill, Richard J Mural, Charles Lee, Stephen W Scherer & Lars Feuk
NATURE GENETICS (2006) VOLUME 38 NUMBER 12 1413-1418

Stochastic mRNA Synthesis in Mammalian Cells.
Arjun Raj, Charles S. Peskin, Daniel Tranchina, Diana Y. Vargas, Sanjay Tyagi
PLOS (2006) Volume 4 Issue 10 e309

Discussion and conclusions

Our results establish a scalable statistical framework for assigning cells measured using scRNA-seq to cancer clones measured independently using shallow scDNA-seq. We expect this approach can be used ubiquitously in the field of single-cell biology including extensions for other multi-modal approaches such as methylation-transcription and chromatin accessibility-transcription.

However, there are certain situations in which clonealign cannot be applied. While it is estimated that 60–80% of cancers exhibit the complex structural genomic rearrangements required to apply clonealign [26, 27], some cancers have quiescent genomes and are devoid of copy number changes. For example, cancers such as karyotypically normal AML, sarcomas, and other pediatric malignancies without genomic instability would not generate the genomic/transcriptomic signals modeled by clonealign [28].

Furthermore, the focus of this work has been on linking transcriptional measurements to genomically defined clones assuming only a copy-number dosage effect on transcript abundance. While the clonealign model allows for integration of allelic imbalance information caused by clone-specific LOH events, the sparse expression of germline heterozygous variants detected by the 10X Chromium 3′ assay demonstrated here makes such information uninformative (Additional file 2: Supplementary text section 3). However, full-transcript-length single-cell RNA sequencing technologies such as Smart-seq2 [29] would allow for further refinement of clonal assignment and represent the appropriate use-case of clonealign's incorporation of allelic imbalance information.

However, the concepts introduced in the clonealign model provide a basis for future studies of the integration of genomic data from independently sampled assays. At the edge of the field, sparse in situ measurements of transcription integrated with independent disaggregated sampling of single-cell genomes are providing a route to studying spatial context of co-located cell populations [30]. Finally, there is an emergence of commercial platforms whereby single-cell, kit-based assays for methylation, transcription, and genome copy number are becoming widely available to the research community. In all of these settings, clonealign and future derivatives will provide a statistical framework to help interpret the cellular constituents of cancer, their fitness, and their phenotypes.
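The copy-number dosage assumption discussed above can be illustrated with a toy clone-assignment routine. This is a deliberately simplified sketch and not the clonealign implementation: it assumes expected read counts are directly proportional to copy number, uses a plain Poisson likelihood, requires every copy number to be positive, and all names and values are hypothetical.

```python
import math

def assign_clone(cell_counts, clone_copy_numbers):
    """Pick the clone whose copy-number profile best explains a cell's
    per-gene counts, under a Poisson model whose expected count for each
    gene is proportional to that gene's copy number in the clone."""
    total = sum(cell_counts)
    best_clone, best_loglik = None, -math.inf
    for clone, copies in clone_copy_numbers.items():
        weight_sum = sum(copies)
        loglik = 0.0
        for count, cn in zip(cell_counts, copies):
            # Expected count: the cell's total reads split in proportion to copy number.
            rate = total * cn / weight_sum
            loglik += count * math.log(rate) - rate - math.lgamma(count + 1)
        if loglik > best_loglik:
            best_clone, best_loglik = clone, loglik
    return best_clone

# Hypothetical copy-number profiles for four genes in two clones.
clones = {
    "clone_1": [2, 2, 2, 2],
    "clone_2": [2, 4, 2, 1],
}
print(assign_clone([10, 21, 9, 5], clones))  # clone_2
```

In this toy setting a cell with doubled counts at the amplified gene and halved counts at the deleted gene is assigned to the clone carrying those events; a genome with no copy number differences between clones would give identical likelihoods, which is the limitation the discussion points out.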


Availability of data and materials

The assembly of the Ler genome has been submitted to the European Nucleotide Archive and is publicly available under the accession number GCA_900660825 [56]. The reads are available as part of a separate study under the project ID PRJEB31147 (preprint [46]). All other assemblies are publicly available at NCBI, and their accession numbers are GCA_000001735.3 [57], GCA_000001405.27 [25], GCA_002077035.3 [22], GCA_001524155.4 [22], GCA_000146045.2 [27], GCA_000977955.2 [26], GCA_000001215.4 [29], GCA_002300595.1 [28], GCA_000005005.6 [31], and GCA_002237485.1 [30]. Further details about the assemblies are in Additional file 2: Table S1. BAM files for the 50 F2 recombinant genomes are available at the European Nucleotide Archive under the project ID PRJEB29265 [33]. SyRI is freely available online under the MIT license [58]. The version of SyRI used in this work is available at [59]. SyRI is developed using Python 3.5 on Linux and can run on other operating systems as well.

Extensive copy-number variation of young genes across stickleback populations

Duplicate genes emerge as copy-number variations (CNVs) at the population level, and remain copy-number polymorphic until they are fixed or lost. The successful establishment of such structural polymorphisms in the genome plays an important role in evolution by promoting genetic diversity, complexity and innovation. To characterize the early evolutionary stages of duplicate genes and their potential adaptive benefits, we combine comparative genomics with population genomics analyses to evaluate the distribution and impact of CNVs across natural populations of an eco-genomic model, the three-spined stickleback. With whole genome sequences of 66 individuals from populations inhabiting three distinct habitats, we find that CNVs generally occur at low frequencies and are often only found in one of the 11 populations surveyed. A subset of CNVs, however, displays copy-number differentiation between populations, showing elevated within-population frequencies consistent with local adaptation. By comparing teleost genomes to identify lineage-specific genes and duplications in sticklebacks, we highlight rampant gene content differences among individuals in which over 30% of young duplicate genes are CNVs. These CNV genes are evolving rapidly at the molecular level and are enriched with functional categories associated with environmental interactions, depicting the dynamic early copy-number polymorphic stage of genes during population differentiation.

Conflict of interest statement

The authors have declared that no competing interests exist.


Figure 1. Phylogenomic relationships among samples.

(A) Phylogenomic network of the 66 genomes…

Figure 2. Frequency and occurrence of CNVs across individuals and populations.

Figure 3. CNV proportions across genomic regions and homozygous deletions.


Bruce, W. R., and Van Der Gaag, H. (1963). A quantitative assay for the number of murine lymphoma cells capable of proliferation in vivo. Nature 199, 79�. doi: 10.1038/199079a0

Burnette, J. M., Miyamoto-Sato, E., Schaub, M. A., Conklin, J., and Lopez, A. J. (2005). Subdivision of large introns in Drosophila by recursive splicing at nonexonic elements. Genetics 170, 661�. doi: 10.1534/genetics.104.039701

Cartwright, R., Tambini, C. E., Simpson, P. J., and Thacker, J. (1998). The XRCC2 DNA repair gene from human and mouse encodes a novel member of the recA/RAD51 family. Nucleic Acids Res. 26, 3084�. doi: 10.1093/nar/26.13.3084

Catania, S., Dumesic, P. A., Pimentel, H., Nasif, A., Stoddard, C. I., Burke, J. E., et al. (2020). Evolutionary persistence of DNA methylation for millions of years after ancient loss of a de novo methyltransferase. Cell 180, 263�.e220. doi: 10.1016/j.cell.2019.12.012

Chang, E., and Harley, C. B. (1995). Telomere length and replicative aging in human vascular tissues. Proc. Natl. Acad. Sci. U.S.A. 92, 11190�. doi: 10.1073/pnas.92.24.11190

Chatterjee, A., Ozaki, Y., Stockwell, P. A., Horsfield, J. A., Morison, I. M., and Nakagawa, S. (2013). Mapping the zebrafish brain methylome using reduced representation bisulfite sequencing. Epigenetics 8, 979�. doi: 10.4161/epi.25797

Cherif, H., Tarry, J. L., Ozanne, S. E., and Hales, C. N. (2003). Ageing and telomeres: a study into organ- and gender-specific telomere shortening. Nucleic Acids Res. 31, 1576�. doi: 10.1093/nar/gkg208

Cho, N. Y., Kim, B. H., Choi, M., Yoo, E. J., Moon, K. C., Cho, Y. M., et al. (2007). Hypermethylation of CpG island loci and hypomethylation of LINE-1 and Alu repeats in prostate adenocarcinoma and their relationship to clinicopathological features. J. Pathol. 211, 269�. doi: 10.1002/path.2106

Chu, W. M., Liu, W. M., and Schmid, C. W. (1995). RNA polymerase III promoter and terminator elements affect Alu RNA expression. Nucleic Acids Res. 23, 1750�. doi: 10.1093/nar/23.10.1750

Claycomb, J. M., Benasutti, M., Bosco, G., Fenger, D. D., and Orr-Weaver, T. L. (2004). Gene amplification as a developmental strategy: isolation of two developmental amplicons in Drosophila. Dev. Cell 6, 145�. doi: 10.1016/S1534-5807(03)00398-8

Cleary, J. D., Tomé, S., López Castel, A., Panigrahi, G. B., Foiry, L., Hagerman, K. A., et al. (2010). Tissue- and age-specific DNA replication patterns at the CTG/CAG-expanded human myotonic dystrophy type 1 locus. Nat. Struct. Mol. Biol. 17, 1079�. doi: 10.1038/nsmb.1876

Cohen, S., Menut, S., and Mຜhali, M. (1999). Regulated formation of extrachromosomal circular DNA molecules during development in Xenopus laevis. Mol. Cell. Biol. 19, 6682�. doi: 10.1128/MCB.19.10.6682

Cohen, S., Yacobi, K., and Segal, D. (2003). Extrachromosomal circular DNA of tandemly repeated genomic sequences in Drosophila. Genome Res. 13, 1133�. doi: 10.1101/gr.907603

Cohn, L. B., Silva, I. T., Oliveira, T. Y., Rosales, R. A., Parrish, E. H., Learn, G. H., et al. (2015). HIV-1 integration landscape during latent and active infection. Cell 160, 420�. doi: 10.1016/j.cell.2015.01.020

Colot, V., Maloisel, L., and Rossignol, J. L. (1999). DNA repeats and homologous recombination: a probable role for DNA methylation in genome stability of eukaryotic cells. J. Soc. Biol. 193, 29� doi: 10.1051/jbio/1999193010029

Copson, E. R., Maishman, T. C., Tapper, W. J., Cutress, R. I., Greville-Heygate, S., Altman, D. G., et al. (2018). Germline BRCA mutation and outcome in young-onset breast cancer (POSH): a prospective cohort study. Lancet Oncol. 19, 169�. doi: 10.1016/S1470-2045(17)30891-4

Cost, G. J., and Boeke, J. D. (1998). Targeting of human retrotransposon integration is directed by the specificity of the L1 endonuclease for regions of unusual DNA structure. Biochemistry 37, 18081�. doi: 10.1021/bi981858s

Cronister, A., Teicher, J., Rohlfs, E. M., Donnenfeld, A., and Hallam, S. (2008). Prevalence and instability of fragile X alleles: implications for offering fragile X prenatal diagnosis. Obstet. Gynecol. 111, 596�. doi: 10.1097/AOG.0b013e318163be0b

Danilevskaya, O. N., Arkhipova, I. R., Traverse, K. L., and Pardue, M. L. (1997). Promoting in tandem: the promoter for telomere transposon HeT-A and implications for the evolution of retroviral LTRs. Cell 88, 647�. doi: 10.1016/S0092-8674(00)81907-8

Das, S., Bonaguidi, M., Muro, K., and Kessler, J. A. (2008). Generation of embryonic stem cells: limitations of and alternatives to inner cell mass harvest. Neurosurg. Focus 24:E4. doi: 10.3171/FOC/2008/24/3-4/E3

de Graaff, E., Willemsen, R., Zhong, N., de Die-Smulders, C. E., Brown, W. T., Freling, G., et al. (1995). Instability of the CGG repeat and expression of the FMR1 protein in a male fragile X patient with a lung tumor. Am. J. Hum. Genet. 57, 609�.

Deininger, P. L., and Batzer, M. A. (1999). Alu repeats and human disease. Mol. Genet. Metab. 67, 183�. doi: 10.1006/mgme.1999.2864

Demanelis, K., Jasmine, F., Chen, L. S., and Chernoff, M. (2020). Determinants of telomere length across human tissues. Science 369:eaaz6876. doi: 10.1126/science.aaz6876

Dewannieux, M., Esnault, C., and Heidmann, T. (2003). LINE-mediated retrotransposition of marked Alu sequences. Nat. Genet. 35, 41�. doi: 10.1038/ng1223

Dillon, L. W., Kumar, P., Shibata, Y., Wang, Y. H., Willcox, S., Griffith, J. D., et al. (2015). Production of extrachromosomal MicroDNAs is linked to mismatch repair pathways and transcriptional activity. Cell Rep. 11, 1749�. doi: 10.1016/j.celrep.2015.05.020

Dokal, I. (2000). Dyskeratosis congenita in all its forms. Br. J. Haematol. 110, 768�. doi: 10.1046/j.1365-2141.2000.02109.x

Doshi, K. D., Yang, A. S., Youssef, E., Shen, L. L., and Issa, J. P. J. (2004). Age and cancer related changes of Alu element DNA methylation in colon. Cancer Res. 64.

Duan, R., Luo, S., Huang, W., Li, H., Peng, Y., Du, Q., et al. (2016). Analysis of CGG repeat instability in germline cells from two male fetuses affected with fragile X syndrome. Zhonghua Yi Xue Yi Chuan Xue Za Zhi 33, 606�. doi: 10.3760/cma.j.issn.1003-9406.2016.05.005

Esnault, C., Maestre, J., and Heidmann, T. (2000). Human LINE retrotransposons generate processed pseudogenes. Nat. Genet. 24, 363�. doi: 10.1038/74184

Evenson, D. P., Darzynkiewicz, Z., and Melamed, M. R. (1980). Relation of mammalian sperm chromatin heterogeneity to fertility. Science 210, 1131�. doi: 10.1126/science.7444440

Evgeni, E., Lymberopoulos, G., Touloupidis, S., and Asimakopoulos, B. (2015). Sperm nuclear DNA fragmentation and its association with semen quality in Greek men. Andrologia 47, 1166�. doi: 10.1111/and.12398

Fidler, I. J., and Hart, I. R. (1982). Biological diversity in metastatic neoplasms: origins and implications. Science 217, 998�. doi: 10.1126/science.7112116

Fidler, I. J., and Kripke, M. L. (1977). Metastasis results from preexisting variant cells within a malignant tumor. Science 197, 893�. doi: 10.1126/science.887927

Field, M., Dudding-Byth, T., Arpone, M., and Baker, E. K. (2019). Significantly elevated FMR1 mRNA and mosaicism for methylated premutation and full mutation alleles in two brothers with autism features referred for fragile X testing. Int. J. Mol. Sci. 20:3907. doi: 10.3390/ijms20163907

Filatov, L. V., Mamayeva, S. E., and Tomilin, N. V. (1987). Non-random distribution of Alu-family repeats in human chromosomes. Mol. Biol. Rep. 12, 117�. doi: 10.1007/BF00368879

Fischer, U., Keller, A., Voss, M., Backes, C., Welter, C., and Meese, E. (2012). Genome-wide gene amplification during differentiation of neural progenitor cells in vitro. PLoS ONE 7:e37422. doi: 10.1371/journal.pone.0037422

Fischer, U., Kim, E., Keller, A., and Meese, E. (2017). Specific amplifications and copy number decreases during human neural stem cells differentiation towards astrocytes, neurons and oligodendrocytes. Oncotarget 8, 25872�. doi: 10.18632/oncotarget.15980

Fortune, M. T., Vassilopoulos, C., Coolbaugh, M. I., Siciliano, M. J., and Monckton, D. G. (2000). Dramatic, expansion-biased, age-dependent, tissue-specific somatic mosaicism in a transgenic mouse model of triplet repeat instability. Hum. Mol. Genet. 9, 439�. doi: 10.1093/hmg/9.3.439

Gasior, S. L., Wakeman, T. P., Xu, B., and Deininger, P. L. (2006). The human LINE-1 retrotransposon creates DNA double-strand breaks. J. Mol. Biol. 357, 1383�. doi: 10.1016/j.jmb.2006.01.089

Gottipati, P., Cassel, T. N., Savolainen, L., and Helleday, T. (2008). Transcription-associated recombination is dependent on replication in mammalian cells. Mol. Cell. Biol. 28, 154�. doi: 10.1128/MCB.00816-07

Greally, J. M. (2002). Short interspersed transposable elements (SINEs) are excluded from imprinted regions in the human genome. Proc. Natl. Acad. Sci. U.S.A. 99, 327�. doi: 10.1073/pnas.012539199

Gritti, A., Frölichsthal-Schoeller, P., Galli, R., Parati, E. A., Cova, L., Pagano, S. F., et al. (1999). Epidermal and fibroblast growth factors behave as mitogenic regulators for a single multipotent stem cell-like population from the subventricular region of the adult mouse forebrain. J. Neurosci. 19, 3287�. doi: 10.1523/JNEUROSCI.19-09-03287.1999

Grover, D., Majumder, P. P., C, B. R., Brahmachari, S. K., and Mukerji, M. (2003). Nonrandom distribution of alu elements in genes of various functional categories: insight from analysis of human chromosomes 21 and 22. Mol. Biol. Evol. 20, 1420�. doi: 10.1093/molbev/msg153

Grover, D., Mukerji, M., Bhatnagar, P., Kannan, K., and Brahmachari, S. K. (2004). Alu repeat analysis in the complete human genome: trends and variations with respect to genomic composition. Bioinformatics 20, 813�. doi: 10.1093/bioinformatics/bth005

Guo, F., Yan, L., Guo, H., Li, L., Hu, B., Zhao, Y., et al. (2015). The transcriptome and DNA methylome landscapes of human primordial germ cells. Cell 161, 1437�. doi: 10.1016/j.cell.2015.05.015

Guo, H., Karberg, M., Long, M., Jones, J. P. III, Sullenger, B., and Lambowitz, A. M. (2000). Group II introns designed to insert into therapeutically relevant DNA target sites in human cells. Science 289, 452�. doi: 10.1126/science.289.5478.452

Guo, H., Zhu, P., Wu, X., Li, X., Wen, L., and Tang, F. (2013). Single-cell methylome landscapes of mouse embryonic stem cells and early embryos analyzed using reduced representation bisulfite sequencing. Genome Res. 23, 2126�. doi: 10.1101/gr.161679.113

Guo, H., Zhu, P., Yan, L., Li, R., Hu, B., Lian, Y., et al. (2014). The DNA methylation landscape of human early embryos. Nature 511, 606�. doi: 10.1038/nature13544

Guo, H., Zimmerly, S., Perlman, P. S., and Lambowitz, A. M. (1997). Group II intron endonucleases use both RNA and protein subunits for recognition of specific sequences in double-stranded DNA. EMBO J. 16, 6835�. doi: 10.1093/emboj/16.22.6835

Gupta, R. C., Golub, E. I., Wold, M. S., and Radding, C. M. (1998). Polarity of DNA strand exchange promoted by recombination proteins of the RecA family. Proc. Natl. Acad. Sci. U.S.A. 95, 9843�. doi: 10.1073/pnas.95.17.9843

Halytskiy, V. (2009). Recombination in telomere DNA as the ageing booster. Ukrain. Biokhim. Z. 81:163.

Hamazaki, N., Kyogoku, H., Araki, H., and Miura, F. (2021). Reconstitution of the oocyte transcriptional network with transcription factors. Nature. 589, 264�. doi: 10.1038/s41586-020-3027-9

Hamburger, A. W., and Salmon, S. E. (1977). Primary bioassay of human tumor stem cells. Science 197, 461�. doi: 10.1126/science.560061

Han, T. S., Sattar, N., and Lean, M. (2006). ABC of obesity. Assessment of obesity and its clinical implications. BMJ 333, 695�. doi: 10.1136/bmj.333.7570.695

Hannibal, R. L., and Baker, J. C. (2016). Selective amplification of the genome surrounding key placental genes in trophoblast giant cells. Curr. Biol. 26, 230�. doi: 10.1016/j.cub.2015.11.060

Hao, L. Y., Armanios, M., Strong, M. A., Karim, B., Feldser, D. M., Huso, D., et al. (2005). Short telomeres, even in the presence of telomerase, limit tissue renewal capacity. Cell 123, 1121�. doi: 10.1016/j.cell.2005.11.020

Harley, C. B., Futcher, A. B., and Greider, C. W. (1990). Telomeres shorten during ageing of human fibroblasts. Nature 345, 458�. doi: 10.1038/345458a0

Harris, C. R., Normart, R., Yang, Q., Stevenson, E., Haffty, B. G., Ganesan, S., et al. (2010). Association of nuclear localization of a long interspersed nuclear element-1 protein in breast tumors with poor prognostic outcomes. Genes Cancer 1, 115�. doi: 10.1177/1947601909360812

Harris, N. L., and Senapathy, P. (1990). Distribution and consensus of branch point signals in eukaryotic genes: a computerized statistical analysis. Nucleic Acids Res. 18, 3015�. doi: 10.1093/nar/18.10.3015

Hartenstine, M. J., Goodman, M. F., and Petruska, J. (2000). Base stacking and even/odd behavior of hairpin loops in DNA triplet repeat slippage and expansion with DNA polymerase. J. Biol. Chem. 275, 18382�. doi: 10.1074/jbc.275.24.18382

Häsler, J., and Strub, K. (2006). Alu elements as regulators of gene expression. Nucleic Acids Res. 34, 5491�. doi: 10.1093/nar/gkl706

Henrichsen, C. N., Vinckenbosch, N., Zöllner, S., Chaignat, E., Pradervand, S., Schütz, F., et al. (2009). Segmental copy number variation shapes tissue transcriptomes. Nat. Genet. 41, 424�. doi: 10.1038/ng.345

Herrera, E., Samper, E., Martín-Caballero, J., Flores, J. M., Lee, H. W., and Blasco, M. A. (1999). Disease states associated with telomerase deficiency appear earlier in mice with short telomeres. EMBO J. 18, 2950�. doi: 10.1093/emboj/18.11.2950

Hirasawa, R., Chiba, H., Kaneda, M., Tajima, S., Li, E., Jaenisch, R., et al. (2008). Maternal and zygotic Dnmt1 are necessary and sufficient for the maintenance of DNA methylation imprints during preimplantation development. Genes Dev. 22, 1607�. doi: 10.1101/gad.1667008

Hohjoh, H., and Singer, M. F. (1996). Cytoplasmic ribonucleoprotein complexes containing human LINE-1 protein and RNA. EMBO J. 15, 630� doi: 10.1002/j.1460-2075.1996.tb00395.x

Hu, E. A., Pan, A., Malik, V., and Sun, Q. (2012). White rice consumption and risk of type 2 diabetes: meta-analysis and systematic review. BMJ 344:e1454. doi: 10.1136/bmj.e1454

Iafrate, A. J., Feuk, L., Rivera, M. N., Listewnik, M. L., Donahoe, P. K., Qi, Y., et al. (2004). Detection of large-scale variation in the human genome. Nat. Genet. 36, 949�. doi: 10.1038/ng1416

Invernizzi, P., Bernuzzi, F., Lleo, A., Pozzoli, V., Bignotto, M., Zermiani, P., et al. (2014). Telomere dysfunction in peripheral blood mononuclear cells from patients with primary biliary cirrhosis. Digest. Liver Dis. 46, 363�. doi: 10.1016/j.dld.2013.11.008

Issa, J. P., Ahuja, N., Toyota, M., Bronner, M. P., and Brentnall, T. A. (2001). Accelerated age-related CpG island methylation in ulcerative colitis. Cancer Res. 61, 3573�. doi: 10.1046/j.1523-5394.2001.009003155.x

Jaeckle, K. A., Decker, P. A., Ballman, K. V., Flynn, P. J., Giannini, C., Scheithauer, B. W., et al. (2011). Transformation of low grade glioma and correlation with outcome: an NCCTG database analysis. J. Neurooncol. 104, 253�. doi: 10.1007/s11060-010-0476-2

Jang, K. L., and Latchman, D. S. (1989). HSV infection induces increased transcription of Alu repeated sequences by RNA polymerase III. FEBS Lett. 258, 255�. doi: 10.1016/0014-5793(89)81667-9

Jeffs, A. R., Benjes, S. M., Smith, T. L., Sowerby, S. J., and Morris, C. M. (1998). The BCR gene recombines preferentially with Alu elements in complex BCR-ABL translocations of chronic myeloid leukaemia. Hum. Mol. Genet. 7, 767�. doi: 10.1093/hmg/7.5.767

Jiménez-Zurdo, J. I., Garc໚-Rodríguez, F. M., Barrientos-Durán, A., and Toro, N. (2003). DNA target site requirements for homing in vivo of a bacterial group II intron encoding a protein lacking the DNA endonuclease domain. J. Mol. Biol. 326, 413�. doi: 10.1016/S0022-2836(02)01380-3

Jurka, J., and Zuckerkandl, E. (1991). Free left arms as precursor molecules in the evolution of Alu sequences. J. Mol. Evol. 33, 49�. doi: 10.1007/BF02100195

Kabacik, S., Horvath, S., Cohen, H., and Raj, K. (2018). Epigenetic ageing is distinct from senescence-mediated ageing and is not prevented by telomerase expression. Aging 10, 2800�. doi: 10.18632/aging.101588

Kaminker, P. G., Kim, S. H., Taylor, R. D., Zebarjadian, Y., Funk, W. D., Morin, G. B., et al. (2001). TANK2, a new TRF1-associated poly(ADP-ribose) polymerase, causes rapid induction of cell death upon overexpression. J. Biol. Chem. 276, 35891�. doi: 10.1074/jbc.M105968200

Kelly, S., Georgomanolis, T., Zirkel, A., Diermeier, S., O'Reilly, D., Murphy, S., et al. (2015). Splicing of many human genes involves sites embedded within introns. Nucleic Acids Res. 43, 4721�. doi: 10.1093/nar/gkv386

Khitrinskaia, I., Stepanov, V. A., and Puzyrev, V. P. (2003). Alu repeats in the human genome. Mol. Biol. 37, 382� doi: 10.1023/A:1024218806634

Korbel, J. O., Urban, A. E., Affourtit, J. P., Godwin, B., Grubert, F., Simons, J. F., et al. (2007). Paired-end mapping reveals extensive structural variation in the human genome. Science 318, 420�. doi: 10.1126/science.1149504

Kuchenbaecker, K. B., Hopper, J. L., Barnes, D. R., Phillips, K. A., Mooij, T. M., Roos-Blom, M. J., et al. (2017). Risks of breast, ovarian, and contralateral breast cancer for BRCA1 and BRCA2 mutation carriers. JAMA 317, 2402�. doi: 10.1001/jama.2017.7112

Kulpa, D. A., and Moran, J. V. (2006). Cis-preferential LINE-1 reverse transcriptase activity in ribonucleoprotein particles. Nat. Struct. Mol. Biol. 13, 655�. doi: 10.1038/nsmb1107

Kuper, H., Adami, H. O., and Trichopoulos, D. (2000). Infections as a major preventable cause of human cancer. J. Intern. Med. 248, 171�. doi: 10.1046/j.1365-2796.2000.00742.x

Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C., Zody, M. C., Baldwin, J., et al. (2001). Initial sequencing and analysis of the human genome. Nature 409, 860�. doi: 10.1038/35057062

Lauer, S., and Avecilla, G. (2018). Single-cell copy number variant detection reveals the dynamics and diversity of adaptation. PLoS Biol. 16:e3000069. doi: 10.1371/journal.pbio.3000069

Lee, H. W., Blasco, M. A., Gottlieb, G. J., Horner, J. W. II., Greider, C. W., and DePinho, R. A. (1998). Essential role of mouse telomerase in highly proliferative organs. Nature 392, 569�. doi: 10.1038/33345

Lee, J. M., Zhang, J., Su, A. I., Walker, J. R., Wiltshire, T., Kang, K., et al. (2010). A novel approach to investigate tissue-specific trinucleotide repeat instability. BMC Syst. Biol. 4:29. doi: 10.1186/1752-0509-4-29

Leonard, D., Ajuh, P., Lamond, A. I., and Legerski, R. J. (2003). hLodestar/HuF2 interacts with CDC5L and is involved in pre-mRNA splicing. Biochem. Biophys. Res. Commun. 308, 793�. doi: 10.1016/S0006-291X(03)01486-4

Li, S., Teng, S., Xu, J., Su, G., Zhang, Y., Zhao, J., et al. (2019). Microarray is an efficient tool for circRNA profiling. Brief. Bioinformatics 20, 1420�. doi: 10.1093/bib/bby006

Li, S. H., Schilling, G., Young, W. S. III., Li, X. J., Margolis, R. L., Stine, O. C., et al. (1993). Huntington's disease gene (IT15) is widely expressed in human and rat tissues. Neuron 11, 985�. doi: 10.1016/0896-6273(93)90127-D

Li, X., and Manley, J. L. (2005). Inactivation of the SR protein splicing factor ASF/SF2 results in genomic instability. Cell 122, 365�. doi: 10.1016/j.cell.2005.06.008

Li, Y., Zhang, Z., Chen, J., Liu, W., Lai, W., Liu, B., et al. (2018). Stella safeguards the oocyte methylome by preventing de novo methylation mediated by DNMT1. Nature 564, 136�. doi: 10.1038/s41586-018-0751-5

Liu, G., and Leffak, M. (2012). Instability of (CTG)n·(CAG)n trinucleotide repeats and DNA synthesis. Cell Biosci. 2:7. doi: 10.1186/2045-3701-2-7

Liu, L., Bailey, S. M., Okuka, M., Muñoz, P., Li, C., Zhou, L., et al. (2007). Telomere lengthening early in development. Nat. Cell Biol. 9, 1436�. doi: 10.1038/ncb1664

Liu, M., Xie, Z., and Price, D. H. (1998). A human RNA polymerase II transcription termination factor is a SWI2/SNF2 family member. J. Biol. Chem. 273, 25541�. doi: 10.1074/jbc.273.40.25541

Liu, W. M., Chu, W. M., Choudary, P. V., and Schmid, C. W. (1995). Cell stress and translational inhibitors transiently increase the abundance of mammalian SINE transcripts. Nucleic Acids Res. 23, 1758�. doi: 10.1093/nar/23.10.1758

Liu, W. M., Maraia, R. J., Rubin, C. M., and Schmid, C. W. (1994). Alu transcripts: cytoplasmic localisation and regulation by DNA methylation. Nucleic Acids Res. 22, 1087�. doi: 10.1093/nar/22.6.1087

Liu, Y., and Mi, Y. (2019). Multi-omic measurements of heterogeneity in HeLa cells across laboratories. Nat. Biotechnol. 37, 314�. doi: 10.1038/s41587-019-0037-y

Loft, S., Kold-Jensen, T., Hjollund, N. H., Giwercman, A., Gyllemborg, J., Ernst, E., et al. (2003). Oxidative DNA damage in human sperm influences time to pregnancy. Hum. Reprod. 18, 1265�. doi: 10.1093/humrep/deg202

Lovekin, C., Ellis, I. O., Locker, A., Robertson, J. F., Bell, J., Nicholson, R., et al. (1991). c-erbB-2 oncoprotein expression in primary and advanced breast cancer. Br. J. Cancer 63, 439�. doi: 10.1038/bjc.1991.101

Maestre, J., Tchénio, T., Dhellin, O., and Heidmann, T. (1995). mRNA retroposition in human cells: processed pseudogene formation. EMBO J. 14, 6333� doi: 10.1002/j.1460-2075.1995.tb00324.x

Maldarelli, F., Wu, X., Su, L., Simonetti, F. R., Shao, W., Hill, S., et al. (2014). HIV latency. Specific HIV integration sites are linked to clonal expansion and persistence of infected cells. Science 345, 179�. doi: 10.1126/science.1254194

Maraia, R., Zasloff, M., Plotz, P., and Adeniyi-Jones, S. (1988). Pathway of B1-Alu expression in microinjected oocytes: Xenopus laevis proteins associated with nuclear precursor and processed cytoplasmic RNAs. Mol. Cell. Biol. 8, 4433�. doi: 10.1128/MCB.8.10.4433

Maraia, R. J. (1991). The subset of mouse B1 (Alu-equivalent) sequences expressed as small processed cytoplasmic transcripts. Nucleic Acids Res. 19, 5695�. doi: 10.1093/nar/19.20.5695

Maraia, R. J., Driscoll, C. T., Bilyeu, T., Hsu, K., and Darlington, G. J. (1993). Multiple dispersed loci produce small cytoplasmic Alu RNA. Mol. Cell. Biol. 13, 4233�. doi: 10.1128/MCB.13.7.4233

Martin, S. L. (2006). The ORF1 protein encoded by LINE-1: structure and function during L1 retrotransposition. J. Biomed. Biotechnol. 2006:45621. doi: 10.1155/JBB/2006/45621

Matera, A. G., Hellmann, U., and Schmid, C. W. (1990). A transpositionally and transcriptionally competent Alu subfamily. Mol. Cell. Biol. 10, 5424�. doi: 10.1128/MCB.10.10.5424

Mathew, B. C., Daniel, R. S., Nabi, A., David, T., and Campbell, I. W. (2010). Telomere and telomerase in cancer and ageing. Jamahiriya Med. J. 10, 86�.

Mathias, S. L., Scott, A. F., Kazazian, H. H. Jr., Boeke, J. D., and Gabriel, A. (1991). Reverse transcriptase encoded by a human transposable element. Science 254, 1808�. doi: 10.1126/science.1722352

Matsutani, S. (2006). Links between repeated sequences. J. Biomed. Biotechnol. 2006:13569. doi: 10.1155/JBB/2006/13569

McConnell, M. J., Lindberg, M. R., Brennand, K. J., Piper, J. C., Voet, T., Cowing-Zitron, C., et al. (2013). Mosaic copy number variation in human neurons. Science 342, 632�. doi: 10.1126/science.1243472

Meinhardt, G., Kaltenberger, S., Fiala, C., Kn཯ler, M., and Pollheimer, J. (2015). ERBB2 gene amplification increases during the transition of proximal EGFR(+) to distal HLA-G(+) first trimester cell column trophoblasts. Placenta 36, 803�. doi: 10.1016/j.placenta.2015.05.017

Meyerson, M., Counter, C. M., Eaton, E. N., Ellisen, L. W., Steiner, P., Caddle, S. D., et al. (1997). hEST2, the putative human telomerase catalytic subunit gene, is up-regulated in tumor cells and during immortalization. Cell 90, 785�. doi: 10.1016/S0092-8674(00)80538-3

Muotri, A. R., Chu, V. T., Marchetto, M. C., Deng, W., Moran, J. V., and Gage, F. H. (2005). Somatic mosaicism in neuronal precursor cells mediated by L1 retrotransposition. Nature 435, 903�. doi: 10.1038/nature03663

Murata, M. (2018). Inflammation and cancer. Environ. Health Prev. Med. 23:50. doi: 10.1186/s12199-018-0740-1

Muratori, M., Tamburrino, L., Marchiani, S., Cambi, M., Olivito, B., Azzari, C., et al. (2015). Investigation on the origin of sperm DNA fragmentation: role of apoptosis, immaturity and oxidative stress. Mol. Med. 21, 109�. doi: 10.2119/molmed.2014.00158

Nei, M., and Rooney, A. P. (2005). Concerted and birth-and-death evolution of multigene families. Annu. Rev. Genet. 39, 121�. doi: 10.1146/annurev.genet.39.073003.112240

Niu, D., Li, L., Yu, Y., Zang, W., Li, Z., Zhou, L., et al. (2020). Evaluation of next generation sequencing for detecting HER2 copy number in breast and gastric cancers. Pathol. Oncol. Res. 26, 2577�. doi: 10.1007/s12253-020-00844-w

Nolin, S. L., Brown, W. T., Glicksman, A., Houck, G. E. Jr., Gargano, A. D., Sullivan, A., et al. (2003). Expansion of the fragile X CGG repeat in females with premutation or intermediate alleles. Am. J. Hum. Genet. 72, 454�. doi: 10.1086/367713

O'Reilly, S. M., Barnes, D. M., Camplejohn, R. S., Bartkova, J., Gregory, W. M., and Richards, M. A. (1991). The relationship between c-erbB-2 expression, S-phase fraction and prognosis in breast cancer. Br. J. Cancer 63, 444�. doi: 10.1038/bjc.1991.102

Ostertag, E. M., DeBerardinis, R. J., Goodier, J. L., Zhang, Y., Yang, N., Gerton, G. L., et al. (2002). A mouse model of human L1 retrotransposition. Nat. Genet. 32, 655�. doi: 10.1038/ng1022

Ouyang, S. P., Liu, Q., Fang, L., and Chen, G. Q. (2007). Construction of pha-operon-defined knockout mutants of Pseudomonas putida KT2442 and their applications in poly(hydroxyalkanoate) production. Macromol. Biosci. 7, 227�. doi: 10.1002/mabi.200600187

Ozawa, M., Sakatani, M., Yao, J., Shanker, S., Yu, F., Yamashita, R., et al. (2012). Global gene expression of the inner cell mass and trophectoderm of the bovine blastocyst. BMC Dev. Biol. 12:33. doi: 10.1186/1471-213X-12-33

Pan, Q., Shai, O., Lee, L. J., Frey, B. J., and Blencowe, B. J. (2008). Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat. Genet. 40, 1413�. doi: 10.1038/ng.259

Panda, A. C., De, S., Grammatikakis, I., Munk, R., Yang, X., Piao, Y., et al. (2017). High-purity circular RNA isolation method (RPAD) reveals vast collection of intronic circRNAs. Nucleic Acids Res. 45:e116. doi: 10.1093/nar/gkx297

Pandya-Jones, A., and Black, D. L. (2009). Co-transcriptional splicing of constitutive and alternative exons. RNA 15, 1896�. doi: 10.1261/rna.1714509

Park, C. H., Bergsagel, D. E., and McCulloch, E. A. (1971). Mouse myeloma tumor stem cells: a primary cell culture assay. J. Natl. Cancer Inst. 46, 411�.

Paterson, M. C., Dietrich, K. D., Danyluk, J., Paterson, A. H., Lees, A. W., Jamil, N., et al. (1991). Correlation between c-erbB-2 amplification and risk of recurrent disease in node-negative breast cancer. Cancer Res. 51, 556�.

Percharde, M., Lin, C. J., Yin, Y., Guan, J., Peixoto, G. A., Bulut-Karslioglu, A., et al. (2018). A LINE1-nucleolin partnership regulates early development and ESC identity. Cell 174, 391�.e319. doi: 10.1016/j.cell.2018.05.043

Perepelitsa-Belancio, V., and Deininger, P. (2003). RNA truncation by premature polyadenylation attenuates human mobile element activity. Nat. Genet. 35, 363�. doi: 10.1038/ng1269

Prak, E. T., Dodson, A. W., Farkash, E. A., and Kazazian, H. H. Jr. (2003). Tracking an embryonic L1 retrotransposition event. Proc. Natl. Acad. Sci. U.S.A. 100, 1832�. doi: 10.1073/pnas.0337627100

Quentin, Y. (1992a). Fusion of a free left Alu monomer and a free right Alu monomer at the origin of the Alu family in the primate genomes. Nucleic Acids Res. 20, 487�. doi: 10.1093/nar/20.3.487

Quentin, Y. (1992b). Origin of the Alu family: a family of Alu-like monomers gave birth to the left and the right arms of the Alu elements. Nucleic Acids Res. 20, 3397�. doi: 10.1093/nar/20.13.3397

Redon, R., Ishikawa, S., Fitch, K. R., Feuk, L., Perry, G. H., Andrews, T. D., et al. (2006). Global variation in copy number in the human genome. Nature 444, 444�. doi: 10.1038/nature05329

Rinker, D. C., Specian, N. K., Zhao, S., and Gibbons, J. G. (2019). Polar bear evolution is marked by rapid changes in gene copy number in response to dietary shift. Proc. Natl. Acad. Sci U.S.A. 116, 13446�. doi: 10.1073/pnas.1901093116

Rissi, V. B., and Glanzner, W. G. (2019). The histone lysine demethylase KDM7A is required for normal development and first cell lineage specification in porcine embryos. Epigenetics 14, 1088�. doi: 10.1080/15592294.2019.1633864

Romero, V., Hosomichi, K., Nakaoka, H., Shibata, H., and Inoue, I. (2017). Structure and evolution of the filaggrin gene repeated region in primates. BMC Evol. Biol. 17:10. doi: 10.1186/s12862-016-0851-5

R࿍iger, N. S., Gregersen, N., and Kielland-Brandt, M. C. (1995). One short well conserved region of Alu-sequences is involved in human gene rearrangements and has homology with prokaryotic chi. Nucleic Acids Res. 23, 256�. doi: 10.1093/nar/23.2.256

Russanova, V. R., Driscoll, C. T., and Howard, B. H. (1995). Adenovirus type 2 preferentially stimulates polymerase III transcription of Alu elements by relieving repression: a potential role for chromatin. Mol. Cell. Biol. 15, 4282�. doi: 10.1128/MCB.15.8.4282

Saeliw, T., Tangsuwansri, C., Thongkorn, S., Chonchaiya, W., Suphapeetiporn, K., Mutirangura, A., et al. (2018). Integrated genome-wide Alu methylation and transcriptome profiling analyses reveal novel epigenetic regulatory networks associated with autism spectrum disorder. Mol. Autism 9:27. doi: 10.1186/s13229-018-0213-9

Sebat, J., Lakshmi, B., Troge, J., Alexander, J., Young, J., Lundin, P., et al. (2004). Large-scale copy number polymorphism in the human genome. Science 305, 525. doi: 10.1126/science.1098918

Shaikh, T. H., Roy, A. M., Kim, J., Batzer, M. A., and Deininger, P. L. (1997). cDNAs derived from primary and small cytoplasmic Alu (scAlu) transcripts. J. Mol. Biol. 271, 222. doi: 10.1006/jmbi.1997.1161

Shampay, J., and Blackburn, E. H. (1988). Generation of telomere-length heterogeneity in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. U.S.A. 85, 534. doi: 10.1073/pnas.85.2.534

Shao, X., Lv, N., Liao, J., Long, J., Xue, R., Ai, N., et al. (2019). Copy number variation is highly correlated with differential gene expression: a pan-cancer study. BMC Med. Genet. 20:175. doi: 10.1186/s12881-019-0909-5

Sharif, J., Muto, M., Takebayashi, S., Suetake, I., Iwamatsu, A., Endo, T. A., et al. (2007). The SRA protein Np95 mediates epigenetic inheritance by recruiting Dnmt1 to methylated DNA. Nature 450, 908. doi: 10.1038/nature06397

Shay, J. W., Reddel, R. R., and Wright, W. E. (2012). Cancer. Cancer and telomeres–an ALTernative to telomerase. Science 336, 1388. doi: 10.1126/science.1222394

Shibata, Y., Kumar, P., Layer, R., Willcox, S., Gagan, J. R., Griffith, J. D., et al. (2012). Extrachromosomal microDNAs and chromosomal microdeletions in normal tissues. Science 336, 82. doi: 10.1126/science.1213307

Shirakawa, T., Yaman-Deveci, R., Tomizawa, S., Kamizato, Y., Nakajima, K., Sone, H., et al. (2013). An epigenetic switch is crucial for spermatogonia to exit the undifferentiated state toward a Kit-positive identity. Development 140, 3565. doi: 10.1242/dev.094045

Singh, N., Baby, D., Rajguru, J. P., Patil, P. B., Thakkannavar, S. S., and Pujari, V. B. (2019). Inflammation and cancer. Ann. Afr. Med. 18, 121. doi: 10.4103/aam.aam_56_18

Singh, N. N., and Lambowitz, A. M. (2001). Interaction of a group II intron ribonucleoprotein endonuclease with its DNA target site investigated by DNA footprinting and modification interference. J. Mol. Biol. 309, 361. doi: 10.1006/jmbi.2001.4658

Sinnett, D., Richer, C., Deragon, J. M., and Labuda, D. (1992). Alu RNA transcripts in human embryonal carcinoma cells. Model of post-transcriptional selection of master sequences. J. Mol. Biol. 226, 689. doi: 10.1016/0022-2836(92)90626-U

Sitte, N., Saretzki, G., and von Zglinicki, T. (1998). Accelerated telomere shortening in fibroblasts after extended periods of confluency. Free Rad. Biol. Med. 24, 885. doi: 10.1016/S0891-5849(97)00363-8

Slamon, D. J. (1990). Studies of the HER-2/neu proto-oncogene in human breast cancer. Cancer Investig. 8:253. doi: 10.3109/07357909009017573

Smit, A. F. (1999). Interspersed repeats and other mementos of transposable elements in mammalian genomes. Curr. Opin. Genet. Dev. 9, 657. doi: 10.1016/S0959-437X(99)00031-3

Smith, Z. D., Chan, M. M., Humm, K. C., Karnik, R., Mekhoubad, S., Regev, A., et al. (2014). DNA methylation dynamics of the human preimplantation embryo. Nature 511, 611. doi: 10.1038/nature13581

Sobinoff, A. P., and Pickett, H. A. (2020). Mechanisms that drive telomere maintenance and recombination in human cancers. Curr. Opin. Genet. Dev. 60, 25. doi: 10.1016/j.gde.2020.02.006

Soumillon, M., Necsulea, A., Weier, M., Brawand, D., Zhang, X., Gu, H., et al. (2013). Cellular source and mechanisms of high transcriptome complexity in the mammalian testis. Cell Rep. 3, 2179. doi: 10.1016/j.celrep.2013.05.031

Spanò, M., Bonde, J. P., Hjøllund, H. I., Kolstad, H. A., Cordelli, E., and Leter, G. (2000). Sperm chromatin damage impairs human fertility. The Danish First Pregnancy Planner Study Team. Fertil. Steril. 73, 43. doi: 10.1016/S0015-0282(99)00462-8

Su, Y., Davies, S., Davis, M., Lu, H., Giller, R., Krailo, M., et al. (2007). Expression of LINE-1 p40 protein in pediatric malignant germ cell tumors and its association with clinicopathological parameters: a report from the Children's Oncology Group. Cancer Lett. 247, 204. doi: 10.1016/j.canlet.2006.04.010

Sudmant, P. H., Mallick, S., Nelson, B. J., Hormozdiari, F., Krumm, N., Huddleston, J., et al. (2015). Global diversity, population stratification, and selection of human copy-number variation. Science 349:aab3761. doi: 10.1126/science.aab3761

Swinburne, I. A., Miguez, D. G., Landgraf, D., and Silver, P. A. (2008). Intron length increases oscillatory periods of gene expression in animal cells. Genes Dev. 22, 2342. doi: 10.1101/gad.1696108

Taggart, A. J., DeSimone, A. M., Shih, J. S., Filloux, M. E., and Fairbrother, W. G. (2012). Large-scale mapping of branchpoints in human pre-mRNA transcripts in vivo. Nat. Struct. Mol. Biol. 19, 719. doi: 10.1038/nsmb.2327

Takahashi, K., and Yamanaka, S. (2006). Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 126, 663. doi: 10.1016/j.cell.2006.07.024

Tang, F., Barbacioru, C., Bao, S., Lee, C., Nordman, E., Wang, X., et al. (2010). Tracing the derivation of embryonic stem cells from the inner cell mass by single-cell RNA-Seq analysis. Cell Stem Cell 6, 468. doi: 10.1016/j.stem.2010.03.015

Tirado, E. E., Rivnay, B., Marquette, M. L., Bourque, V., Araneo, C., and Leader, B. S. (2009). Age is highly correlated with oxidative damage in sperm from infertile males. Fertil. Steril. 92, S220–S221. doi: 10.1016/j.fertnstert.2009.07.1524

Vanneste, E., Voet, T., Le Caignec, C., Ampe, M., Konings, P., Melotte, C., et al. (2009). Chromosome instability is common in human cleavage-stage embryos. Nat. Med. 15, 577. doi: 10.1038/nm.1924

Vatsavayai, S. C., Dallérac, G. M., Milnerwood, A. J., Cummings, D. M., Rezaie, P., Murphy, K. P., et al. (2007). Progressive CAG expansion in the brain of a novel R6/1-89Q mouse model of Huntington's disease with delayed phenotypic onset. Brain Res. Bull. 72, 98. doi: 10.1016/j.brainresbull.2006.10.015

Villasante, A., de Pablos, B., Méndez-Lago, M., and Abad, J. P. (2008). Telomere maintenance in Drosophila: rapid transposon evolution at chromosome ends. Cell Cycle 7, 2134. doi: 10.4161/cc.7.14.6275

Wagner, T. A., McLaughlin, S., Garg, K., Cheung, C. Y., Larsen, B. B., Styrchak, S., et al. (2014). HIV latency. Proliferation of cells with HIV integrated into cancer genes contributes to persistent infection. Science 345, 570. doi: 10.1126/science.1256304

Wallace, N., Wagstaff, B. J., Deininger, P. L., and Roy-Engel, A. M. (2008). LINE-1 ORF1 protein enhances Alu SINE retrotransposition. Gene 419, 1–6. doi: 10.1016/j.gene.2008.04.007

Wan, R., Bai, R., Yan, C., Lei, J., and Shi, Y. (2019). Structures of the catalytically activated yeast spliceosome reveal the mechanism of branching. Cell 177, 339. doi: 10.1016/j.cell.2019.02.006

Wang, E. T., Sandberg, R., Luo, S., Khrebtukova, I., Zhang, L., Mayr, C., et al. (2008). Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470. doi: 10.1038/nature07509

Wang, Y., Bernhardy, A. J., Nacson, J., and Krais, J. J. (2019). BRCA1 intronic Alu elements drive gene rearrangements and PARP inhibitor resistance. Nat. Commun. 10:5661. doi: 10.1038/s41467-019-13530-6

Wei, M., Grushko, T. A., Dignam, J., Hagos, F., Nanda, R., Sveen, L., et al. (2005). BRCA1 promoter methylation in sporadic breast cancer is associated with reduced BRCA1 copy number and chromosome 17 aneusomy. Cancer Res. 65, 10692. doi: 10.1158/0008-5472.CAN-05-1277

Wöhrle, D., Hirst, M. C., Kennerknecht, I., Davies, K. E., and Steinbach, P. (1992). Genotype mosaicism in fragile X fetal tissues. Hum. Genet. 89, 114. doi: 10.1007/BF00207057

Woodcock, D. M., Lawler, C. B., Linsenmeyer, M. E., Doherty, J. P., and Warren, W. D. (1997). Asymmetric methylation in the hypermethylated CpG promoter region of the human L1 retrotransposon. J. Biol. Chem. 272, 7810. doi: 10.1074/jbc.272.12.7810

Xia, B., Yan, Y., Baron, M., Wagner, F., Barkley, D., Chiodin, M., et al. (2020). Widespread transcriptional scanning in the testis modulates gene evolution rates. Cell 180, 248. doi: 10.1016/j.cell.2019.12.015

Yu, Z., Zhu, Y., Chen-Plotkin, A. S., Clay-Falcone, D., McCluskey, L., Elman, L., et al. (2011). PolyQ repeat expansions in ATXN2 associated with ALS are CAA interrupted repeats. PLoS ONE 6:e17951. doi: 10.1371/journal.pone.0017951

Yulug, I. G., Yulug, A., and Fisher, E. M. (1995). The frequency and position of Alu repeats in cDNAs, as determined by database searching. Genomics 27, 544. doi: 10.1006/geno.1995.1090

Zhang, Y., Zhang, X. O., Chen, T., Xiang, J. F., Yin, Q. F., Xing, Y. H., et al. (2013). Circular intronic long noncoding RNAs. Mol. Cell 51, 792. doi: 10.1016/j.molcel.2013.08.017

Zheng, Y. H., Lovsin, N., and Peterlin, B. M. (2005). Newly identified host factors modulate HIV replication. Immunol. Lett. 97, 225. doi: 10.1016/j.imlet.2004.11.026

Zhu, P., Guo, H., Ren, Y., and Hou, Y. (2018). Single-cell DNA methylome sequencing of human preimplantation embryos. Nat. Genet. 50, 12. doi: 10.1038/s41588-017-0007-6

Keywords: copy number variation, transcription, evolution, embryonic development, senescence, oncogenesis, homologous recombination, retrotransposons

Citation: Sui Y and Peng S (2021) A Mechanism Leading to Changes in Copy Number Variations Affected by Transcriptional Level Might Be Involved in Evolution, Embryonic Development, Senescence, and Oncogenesis Mediated by Retrotransposons. Front. Cell Dev. Biol. 9:618113. doi: 10.3389/fcell.2021.618113

Received: 16 October 2020 Accepted: 11 January 2021
Published: 11 February 2021.

Edited by:
Valentina Massa, University of Milan, Italy

Reviewed by:
Pedro P. Rocha, Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), United States
Fang Bai, Nankai University, China

Copyright © 2021 Sui and Peng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.