Journal of Virology, October 2008, p. 9964-9977, Vol. 82, No. 20
0022-538X/08/$08.00+0 doi:10.1128/JVI.01299-08
Copyright © 2008, American Society for Microbiology. All Rights Reserved.

Beihua Dong,5
David Boren,2
Serena A. Lee,2
Jaydip Das Gupta,5
Christina Gaughan,5
Eric A. Klein,6
Christopher Lee,3
Robert H. Silverman,5 and
Samson A. Chow1,2,3,4*
Biomedical Engineering Interdepartmental Program,1 Department of Molecular and Medical Pharmacology,2 Molecular Biology Institute,3 UCLA AIDS Institute, UCLA School of Medicine, Los Angeles, California 90095,4 Department of Cancer Biology, Lerner Research Institute,5 Glickman Urological and Kidney Institute, Cleveland Clinic, Cleveland, Ohio 44195,6
Received 22 June 2008/ Accepted 3 August 2008
The association of RNASEL mutations with prostate cancer suggests that inherited defects of RNase L may enhance susceptibility to infectious agents, leading to tumorigenesis. Testing this hypothesis led to the identification of a new human retrovirus, xenotropic murine leukemia virus (MLV)-related virus (XMRV), in prostate cancer patients with the QQ variant of RNASEL (101). XMRV was detected in 40% (8 of 20) of the prostate cancers from QQ patients, compared with 1.5% among heterozygous (RQ) and wild-type (RR) patients (1 of 66). XMRV is 8,185 nucleotides in length, harbors no host-derived oncogenes, and shares up to 95% overall nucleotide sequence identity with known MLVs (101). A molecular clone of XMRV capable of infecting human prostate and nonprostate cell lines has been constructed (28). Replication of the cloned virus is sensitive to IFN-β inhibition, and RNase L is required for a complete IFN antiviral response, both findings consistent with the observation that XMRV is associated with patients having the QQ genotype (28). Expression of the human cell surface receptor XPR1 (xenotropic and polytropic retrovirus receptor 1) is required for XMRV infection, implicating XPR1 as an XMRV receptor.
Retroviruses that do not carry oncogenes, such as avian leukosis virus and Moloney MLV, usually induce tumors in their susceptible host animals by proviral insertional mutagenesis, in which proto-oncogenes are activated via promoter or enhancer insertion as a consequence of integrating the viral DNA genome into the host cell chromosome (65). Previous studies showed that most of the host genome is accessible for retroviral integration but that target site selection is not random (67, 86, 108). Furthermore, the viruses studied thus far for their positions of integrated provirus in the human genome show different patterns of target site preference and can be divided into three groups (27, 67). Human immunodeficiency virus type 1 (HIV-1), simian immunodeficiency virus, feline immunodeficiency virus, and equine infectious anemia virus form one group and share a common feature of integrating predominantly within transcription units (25, 38, 45, 48, 86). The second group comprises MLV, porcine endogenous retrovirus, and foamy virus, and their integration favors transcription start sites or CpG islands (67, 69, 73, 100, 108). The third group consists of human T-cell leukemia virus type 1 (HTLV-1), avian sarcoma-leukosis virus, and mouse mammary tumor virus. They have the most random distribution of integration sites and show only a slight preference for transcription units, transcription start sites, or CpG islands (5, 27, 34, 67, 71). The preference for transcription start sites may contribute to the observation that MLV-based vectors are more prone to activate proto-oncogenes via insertional mutagenesis than HIV-based vectors (3, 70, 88). Therefore, integration site preference may have important significance for the potential impact of a retrovirus on its host and the safety of retrovirus-based vectors in gene therapy approaches. 
This concern is poignantly demonstrated by the subsequent development of leukemia in three children with X-linked severe combined immunodeficiency after an otherwise successful gene therapy trial by use of an MLV-derived vector (36, 37). Analysis of leukemic cells from two patients found integration of the vector in the 5' region of the LMO2 oncogene.
The high frequency of XMRV detection in prostate cancer from QQ patients and the validation of XMRV as a bona fide human retrovirus (28, 101) raised the possibility that XMRV might be involved in prostate cancer formation. If the initiation and progression of prostate cancer are affected by the ability of RNase L to suppress XMRV replication, the virus could contribute to the geographical prevalence of the disease in developed countries. Involvement of a viral mechanism may also partly explain the morphological and multifocal heterogeneities that distinguish prostate cancer from other common cancers. To determine the integration site preference of XMRV and the potential risk of proviral insertional mutagenesis, we carried out a genome-wide analysis of viral integration sites in a prostate cell line after an acute XMRV infection. In comparison with that of MLV and two human retroviruses, HIV-1 and HTLV-1, integration of XMRV shows a strong preference for transcription start sites, CpG islands, gene-dense regions, and DNase-hypersensitive sites. In prostate cancer tissues, in addition to the aforementioned chromosomal features, XMRV integration sites are associated with frequent cancer breakpoints, common fragile sites, microRNA (miRNA), and cancer-related genes. These associations in prostate cancer tissues may represent a selection event for particular XMRV integration sites and suggest that XMRV may play a role in prostate cancer development.
XMRV RNA and DNA determination by PCR.
Peripheral zones from tumor-bearing prostate tissues were frozen at the time of surgery and stored at –80°C. RNA or DNA was isolated from frozen tissue by using Trizol (Invitrogen) or a QIAamp DNA mini kit (Qiagen), respectively, following manufacturer's instructions. XMRV gag sequences were detected using nested reverse transcriptase PCR (RT-PCR), nested PCR, or quantitative RT-PCR. For nested RT-PCR, the first-strand cDNA was synthesized with 1.5 µg of total RNA by using an iScript Select cDNA synthesis kit (Bio-Rad) containing random hexamer oligonucleotides as a primer. Nested PCR on cDNA (1/10 of the total) or on genomic DNA (0.5 µg) was performed with duplicates for the detection of gag sequences as described previously, with modifications (the first round of PCR was at 52°C for 35 cycles; the second round of PCR was at 54°C for 35 cycles) (101). Platinum Taq polymerase (Invitrogen) was used for nested PCR. Two-step quantitative RT-PCR was performed with total RNA (1.5 µg) and the Q528R primer (28), using the iScript Select cDNA synthesis kit. Applied Biosystems TaqMan universal PCR master mix was used for the quantitative PCR mixture, containing 900 nM each of Q445T and Q528R and 250 nM of TaqMan probe, and PCR was performed with an Applied Biosystems 7500 instrument (28). Reaction mixtures were incubated at 50°C for 2 min (for optimal AmpErase UNG activity), followed by incubation at 95°C for 10 min (for deactivation of AmpErase UNG and activation of AmpliTaq Gold). The cycling conditions were 50 cycles at 95°C for 15 s and 60°C for 1 min. Known copy numbers of XMRV RNA were used as standards.
Cloning XMRV integration sites.
The assay for determining XMRV integration sites in DU145 cells was similar to that described previously for HIV-1 (48). Briefly, genomic DNA from XMRV-infected DU145 cells was digested with PstI, which cuts once in the XMRV genome at nucleotide position 7150 and produces on average 4-kbp DNA fragments. After digestion, DNA was denatured and annealed with a biotinylated primer, bXMRV7550 (5'-biotin-ATCCTACTCTTCGGACCCTGT), which is complementary to nucleotide positions 7550 to 7571 within the env gene, about 140 bp upstream of the right long terminal repeat (LTR). The annealed primer was then extended using the PicoMaxx high-fidelity PCR system (Stratagene) to produce biotinylated double-stranded DNA containing the virus-human DNA junction region (Int-DNA). The Int-DNA was isolated by binding to streptavidin-agarose Dynabeads (Dynal) and digested with TaqI (5'-T^CGA), a 4-bp cutter that does not cleave the viral DNA portion of the Int-DNA and produces on average 250-bp DNA fragments. After digestion, the Int-DNA was ligated with TaqLinker, which was prepared by annealing BHLinkAl (5'-CGGATCCCGCATCATATCTCCAGGTGTGACAGTTT) with TaqLinkS (5'-CACCTGGAGATATGATGCGGGATC). The TaqLinker contains a 2-nucleotide 5' overhang (the 5'-terminal CG of BHLinkAl) complementary with the TaqI-digested Int-DNA. The linker-ligated Int-DNA was amplified by a two-step PCR. The first PCR was carried out using primers XMRV8027F (5'-AACCAATCAGCTCGCTTCTC) and Linker1 (5'-TAACTGTCACACCTGGAGATA) in a final volume of 300 µl with 0.5 µM of each primer, 0.2 mM deoxynucleoside triphosphates, and 12 U Pfu DNA polymerase (Stratagene) under the following conditions: 2 min of preincubation at 94°C, followed by 29 cycles at 94°C for 30 s, 58°C for 30 s, and 72°C for 4 min. The PCR product was purified using a PCR purification kit (Qiagen) and was used as the template for the second PCR, which used two nested primers, XMRV8147F (5'-CGGGTACCCGTGTTCCCAATA) and Linker2 (5'-TAGATATGATGCGGGATCCG), which anneal downstream of the XMRV8027F and Linker1 binding sites, respectively. The conditions for the second PCR were identical to those for the first PCR, except the second PCR was conducted with only 18 cycles. The second PCR product was electrophoresed on a 1.5% agarose gel, and diffused bands between 200 bp and 2 kbp were extracted using a gel extraction kit (Qiagen). Extracted DNA was cloned into a pCR-Blunt vector by using a Zero Blunt PCR cloning kit (Invitrogen).
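The linker design can be sanity-checked directly from the two oligonucleotide sequences quoted in the protocol. The short sketch below is only an illustration, not part of the published method; the helper names are hypothetical. It anneals TaqLinkS to BHLinkAl in silico and reports which bases of BHLinkAl stay unpaired at its 5' end.

```python
# Illustrative check of the TaqLinker overhang; helper names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_overhang(long_strand: str, short_strand: str) -> str:
    """Bases of long_strand left unpaired at its 5' end after annealing."""
    paired_region = reverse_complement(short_strand)
    start = long_strand.find(paired_region)
    if start < 0:
        raise ValueError("strands are not complementary")
    return long_strand[:start]


# Oligonucleotide sequences as quoted in the text (written 5' to 3').
BHLINKAL = "CGGATCCCGCATCATATCTCCAGGTGTGACAGTTT"
TAQLINKS = "CACCTGGAGATATGATGCGGGATC"

overhang = five_prime_overhang(BHLINKAL, TAQLINKS)
print(overhang)  # prints CG
```

The two unpaired bases are the 5'-CG end that TaqI digestion also leaves on the Int-DNA, which is what makes the ligation possible.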
Cloning of integration sites from human prostate cancer tissues was similar to the procedure described earlier for DU145 cells, except that the virus-host DNA junction was first subjected to linear amplification. This was necessary because only 1% of prostate cells from homozygous QQ patients showed XMRV infection (101). Ten micrograms of genomic DNA isolated from prostate peripheral zones (containing tumor cells) as described previously (28) was mixed with 50 pM bXMRV7550, 0.2 mM deoxynucleoside triphosphates, and PicoMaxx polymerase (Stratagene) under the following conditions: 3 min of preincubation at 94°C, followed by 80 cycles at 94°C for 30 s, 58°C for 30 s, and 72°C for 4 min. Two units of fresh PicoMaxx enzyme was added to the reaction mixture after 40 cycles. Biotinylated DNA was isolated from the PCR product by binding to 200 mg of streptavidin-agarose Dynabeads, and the remaining procedure was identical to that described earlier for XMRV-infected DU145 cells.
Sequence analysis and mapping integration sites. The sequence of the cloned DNA was determined by dideoxy sequencing, and sequencing ambiguities were resolved by repeated sequencing on both strands. The authenticity of the integration site sequence was verified by the following criteria: (i) the sequence contained both the XMRV LTR and the linker sequence, (ii) a match to the human genome started after the end of the LTR (5'-...CA-3') and ended with the linker sequence, and (iii) the host DNA region (containing 20 or more nucleotides) from the putative integration site sequence showed 96% or greater identity to the human genomic sequence. The authenticated integration site sequences were then mapped to the human genome hg18 (UCSC March 2006 freeze; NCBI build 36.1) by using the BLASTN program (http://www.ensembl.org/index.html) or BLAT (UCSC; http://genome.ucsc.edu/). Transcription units in the vicinity of the integration sites were identified using the RefSeq gene database (NCBI Reference Sequence Project; http://www.ncbi.nih.gov/RefSeq/). Similarities to repetitive sequences were analyzed as described previously (48). All the genomic feature data sets were downloaded from the UCSC genome database (http://genome.ucsc.edu/cgi-bin/hgTracks).
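The three authenticity criteria amount to a simple filter over sequenced clones. The sketch below is one possible way to express it, assuming the LTR, linker, and genome alignment have already been located by an upstream tool; the `JunctionRead` record and its field names are hypothetical, and only the two thresholds (20 nucleotides, 96% identity) come from the text.

```python
# Hypothetical record layout; only the thresholds are taken from the text.
from dataclasses import dataclass


@dataclass
class JunctionRead:
    has_ltr: bool                # read contains the XMRV LTR end (5'-...CA-3')
    has_linker: bool             # read contains the linker sequence
    match_starts_after_ltr: bool # genome match begins right after the LTR end
    match_ends_at_linker: bool   # genome match runs up to the linker
    host_length: int             # nucleotides of host DNA in the read
    host_identity: float         # fraction identical to the genome (0 to 1)


def is_authentic(read: JunctionRead,
                 min_host_length: int = 20,
                 min_identity: float = 0.96) -> bool:
    """Apply criteria (i) to (iii) to one sequenced junction."""
    criterion_i = read.has_ltr and read.has_linker
    criterion_ii = read.match_starts_after_ltr and read.match_ends_at_linker
    criterion_iii = (read.host_length >= min_host_length
                     and read.host_identity >= min_identity)
    return criterion_i and criterion_ii and criterion_iii
```

A read with 19 host nucleotides, or one with 95% identity to the genome, would be rejected under these settings even if the LTR and linker are both present.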
Statistical analysis of integration site sequences. To determine integration site selection bias, a comparative set of 10,000 random positions in the human genome was generated in silico by choosing random numbers between 1 and 3,093,120,360, which represents the total length of the 22 autosomal chromosomes plus the X and Y sex chromosomes. To test for differences in proportions, we used r × c contingency table analysis (by Fisher's exact test when individual cell counts were small [<10] or by chi-square approximation). To test for equality of distribution, we used the two-sample Kolmogorov-Smirnov test.
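The random control described here can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' script: the function names and the fixed seed are hypothetical, and the mapping helper expects the caller to supply chromosome names and lengths (for example from the hg18 assembly) in genome order. Only the genome length and the count of 10,000 positions are taken from the text.

```python
import bisect
import random
from itertools import accumulate

GENOME_LENGTH = 3_093_120_360  # total length quoted in the text


def random_positions(n, genome_length=GENOME_LENGTH, seed=0):
    """Draw n genome-wide coordinates uniformly from 1..genome_length."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return [rng.randint(1, genome_length) for _ in range(n)]


def to_chromosome(position, chrom_names, chrom_lengths):
    """Map a 1-based genome-wide coordinate to (chromosome, 1-based offset)."""
    ends = list(accumulate(chrom_lengths))  # cumulative end of each chromosome
    if not 1 <= position <= ends[-1]:
        raise ValueError("position outside the concatenated genome")
    index = bisect.bisect_left(ends, position)
    start = ends[index - 1] if index else 0
    return chrom_names[index], position - start


control = random_positions(10_000)
```

Because each base of the concatenated genome is equally likely, longer chromosomes receive proportionally more control sites, which is the null model the integration data are compared against.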
Nucleotide sequence accession numbers. The GenBank accession numbers for integration site sequences from DU145 cells and prostate cancer tissues are EU981292 to EU981799 and EU981800 to EU981813, respectively.
1 copy/100 cells (28). The amplified junction sequence was cloned and sequenced. The authenticity of the sequence was then verified and the location mapped to the human genome. We sequenced a total of 508 authentic XMRV integration sites from DU145 cells, and 472 of these sites were mapped to unique locations in the human genome. Integration events were found in all 24 human chromosomes (22 autosomes and the sex chromosomes X and Y) (Fig. 1). The frequencies of integration of XMRV were generally proportional to chromosome size, but the overall frequency of XMRV integration into human chromosomes was different from that of uniformly random integration (P < 0.0001). Notably, chromosomes 1, 17, and 19 were significantly overrepresented (P = 0.0015, 0.0021, and <0.0001, respectively), while chromosomes 5, 13, and X were significantly underrepresented (P = 0.0081, 0.0099, and 0.0002, respectively). Different integration frequencies among the different human chromosomes have also been observed for other retroviruses (38, 55, 67, 71, 73, 86). Additionally, using the previously defined criterion for an integration hot spot (86), namely three or more integrations within a 100-kbp region, we identified four integration hot spots for XMRV (Table 1).
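The hot spot criterion (three or more integrations within a 100-kbp region) can be applied with a single pass over sorted coordinates. The sketch below is one possible reading of that rule, using a greedy scan that reports non-overlapping clusters; the function name and the example coordinates are made up for illustration and are not data from this study.

```python
from collections import defaultdict


def find_hot_spots(sites, window=100_000, min_sites=3):
    """Return (chromosome, first, last, count) for each cluster of at least
    min_sites integrations whose span is no more than window bases.

    sites is an iterable of (chromosome, position) pairs.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in sites:
        by_chrom[chrom].append(pos)

    hot_spots = []
    for chrom in sorted(by_chrom):
        pos = sorted(by_chrom[chrom])
        i = 0
        while i < len(pos):
            # Extend the cluster while the next site is still within the window.
            j = i
            while j + 1 < len(pos) and pos[j + 1] - pos[i] <= window:
                j += 1
            count = j - i + 1
            if count >= min_sites:
                hot_spots.append((chrom, pos[i], pos[j], count))
                i = j + 1  # continue after the reported cluster
            else:
                i += 1
    return hot_spots


# Made-up coordinates, for illustration only.
example_sites = [("chr1", 1_000), ("chr1", 50_000), ("chr1", 90_000),
                 ("chr1", 500_000), ("chr2", 10), ("chr2", 200_000)]
example_hot_spots = find_hot_spots(example_sites)
```

In the example, the three chr1 sites spanning 89 kbp form one hot spot, while the isolated chr1 site and the two widely separated chr2 sites do not.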
FIG. 1. Positions of XMRV integration sites in the human genome. The human chromosomes are shown numbered. Centromere locations are denoted by chromosomal indentations. Sites of XMRV integration in DU145 cells are indicated as red vertical lines along the top, and XMRV integration sites in prostate cancer tissues are indicated as blue "lollipops" on the bottom. Within each chromosome, the top bar shows the relative densities of RefSeq genes, with higher gene-dense regions shown as a more intense cyan. The second bar shows the chromosome cytobands. The third bar shows the cancer breakpoints, and the frequencies of breakpoints in different chromosomal regions are denoted by different colors (see the key at the bottom right-hand corner). The green shading in the bottom bar denotes the locations of common fragile sites.
TABLE 1. Integration hotspots of XMRV
TABLE 2. Integration site datasets used in this study
TABLE 3. Genomic features associated with retroviral integration sites
Integration of MLV also showed a weak tendency for favoring active genes (67, 108). To examine the effect of transcriptional activity on XMRV integration, the expression levels of genes whose transcription start sites are closest to and within 10 kbp of an integration site were analyzed using transcriptional profiling of DU145 cells (GSM133589; NCBI Gene Expression Omnibus). The results showed that the percentage of XMRV integration increased proportionally with increasing gene activity (P = 0.013) (Fig. 2).
FIG. 2. Frequency of XMRV integration as a function of gene activity. All the genes in the transcription profile data of DU145 cells (GSM133589; NCBI Gene Expression Omnibus) were ranked by their relative levels of expression and distributed into five "bins" according to their levels of expression (x axis). The leftmost bin contains genes with the lowest expression levels, and the rightmost bin contains those with the highest. Genes whose transcription start sites were closest to and within 10 kbp of the XMRV integration site (solid line) or the random control (dotted line) were then distributed into the same bins based on their expression levels and summed and the values expressed as the percentages of all genes in the indicated bin (y axis). The chi-square test was used to compare the trend to the null hypothesis of no bias due to expression level (67).
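The binning procedure in this legend can be sketched as follows. This is an illustration under assumptions rather than the authors' pipeline: the function names are hypothetical, expression values are assumed to be available as a gene-to-level mapping, and bins are formed by splitting the ranked gene list into equal-sized groups.

```python
def expression_bins(expression, n_bins=5):
    """Assign each gene a bin index 0..n_bins-1 by expression rank (0 = lowest)."""
    ranked = sorted(expression, key=expression.get)
    return {gene: rank * n_bins // len(ranked)
            for rank, gene in enumerate(ranked)}


def percent_hit_per_bin(expression, hit_genes, n_bins=5):
    """Percentage of genes in each expression bin that lie near a site.

    hit_genes is the set of genes whose transcription start site is the
    closest one to, and within 10 kbp of, an integration (or control) site.
    """
    bins = expression_bins(expression, n_bins)
    totals = [0] * n_bins
    hits = [0] * n_bins
    for gene, index in bins.items():
        totals[index] += 1
        if gene in hit_genes:
            hits[index] += 1
    return [100.0 * h / t if t else 0.0 for h, t in zip(hits, totals)]
```

Running the same function once with the genes near integration sites and once with the genes near random control sites would give the two curves being compared.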
Integration frequency near and within transcription units. Although integrations of MLV, HIV-1, and HTLV-1 all favor transcription units, MLV and HTLV-1 integrate preferentially near the start of transcriptional units, whereas HIV-1 integrates throughout the entire length of the transcriptional region (27, 67, 86, 108). To examine the preference of XMRV integration sites within and near transcription units, we normalized the lengths of all RefSeq genes by dividing the genes into 10 bins and dividing 40-kbp regions upstream and downstream of the genes into 5-kbp windows. The relative integration frequency was calculated by dividing the number of integration sites in each bin or window by that in the random control (Fig. 3A). The preferences for integration of MLV, HIV-1, and HTLV-1 into transcription units and intergenic regions were consistent with previous reports (27, 67, 86, 108). In comparison to those of the other retroviruses examined, the integration pattern of XMRV was most similar to that of MLV. The distribution curve for frequency of integration site was bell shaped, with the peak centered near the transcription start site (Fig. 3A).
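The length normalization can be made concrete with a small classifier that places a site either in one of 10 within-gene bins or in a 5-kbp flanking window. The sketch below is a hedged illustration: the function names and label format are hypothetical, and the relative frequency is computed as a ratio of fractions so that data sets of different sizes are comparable, which is an assumption about how a value of 1 comes to mean "no difference".

```python
from collections import Counter


def classify(pos, gene_start, gene_end, strand,
             n_bins=10, flank=40_000, window=5_000):
    """Label a site relative to one gene (gene_start < gene_end).

    Returns ("gene", bin), ("upstream", window_index),
    ("downstream", window_index), or None if farther than flank.
    Window index 0 is the 5-kbp window closest to the gene.
    """
    length = gene_end - gene_start
    # Offset measured in the direction of transcription from the start site.
    rel = pos - gene_start if strand == "+" else gene_end - pos
    if 0 <= rel <= length:
        return ("gene", min(rel * n_bins // length, n_bins - 1))
    dist = -rel if rel < 0 else rel - length
    if dist > flank:
        return None
    side = "upstream" if rel < 0 else "downstream"
    return (side, (dist - 1) // window)


def relative_frequency(site_labels, control_labels):
    """Fraction of sites per label divided by the control fraction."""
    sites, control = Counter(site_labels), Counter(control_labels)
    n_sites, n_control = len(site_labels), len(control_labels)
    return {label: (sites[label] / n_sites) / (count / n_control)
            for label, count in control.items()}
```

For a minus-strand gene the offset is measured from `gene_end`, so "upstream" always means upstream of the transcription start site regardless of orientation.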
FIG. 3. Integration site distribution of XMRV and other retroviruses. (A) Integration intensity within and near transcription units. All RefSeq genes, demarcated by transcription start site (TSS) and transcription termination (TT), were normalized to a common length and then divided into 10 bins (shaded area) to allow comparison. Chromosomal regions up to a distance of 40 kbp upstream and downstream of the transcription unit were divided into 5-kbp windows. The number of integration sites in each bin or 5-kbp window was divided by the number of random control sites in the same bin or window and the value plotted. A value of 1 indicates no difference between the experimental sites and the random control. (B) Integration frequency near transcription start sites. Chromosomal regions within ±12 kbp of the RefSeq transcription start site were divided into 2-kbp windows. The distance upstream of the TSS is denoted by the minus sign. The numbers of integration sites of various retroviruses or random sites in each 2-kbp window were determined and expressed as percentages of the total integration sites.
Integration frequency near genomic features associated with gene regulatory regions. The strong preference of XMRV integration near transcription start sites prompted us to examine other genomic features that are frequently associated with gene regulatory regions, such as CpG islands, DNase-hypersensitive sites, and transcription factor-binding sites. CpG islands are regions (at least 200 bp) rich in the CpG dinucleotide and thought to be involved in transcription regulation of nearby genes (9, 54). Previous results have shown that MLV integration favors CpG islands, whereas HIV-1 disfavors CpG islands, and HTLV-1 has no preference (27, 67, 86, 108). In our analysis, it was evident that integration of both XMRV and MLV strongly favored CpG islands (Fig. 4A). Within the ±2-kbp window of the CpG island midpoint, the percentages of XMRV and MLV integration sites were 27.8% and 17.9%, respectively. The preference of XMRV for CpG islands was significantly stronger than that of MLV (P < 0.0001). Compared with the random control (3.2%), HIV-1 (2.5%; P = 0.0440) weakly disfavored CpG islands, whereas HTLV-1 (4.9%; P = 0.0177) showed a moderate preference.
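Percentages such as "within the ±2-kbp window" reduce to a nearest-feature lookup. The sketch below shows one way to compute them for a single chromosome using binary search; the function name is hypothetical, and features are represented by a single coordinate (for example a CpG island midpoint), which is a simplifying assumption.

```python
import bisect


def percent_within(sites, features, half_window):
    """Percentage of sites lying within half_window bases of any feature.

    sites and features are positions on the same chromosome.
    """
    features = sorted(features)
    hits = 0
    for pos in sites:
        i = bisect.bisect_left(features, pos)
        # Only the features immediately left and right of pos can be nearest.
        neighbours = features[max(i - 1, 0):i + 1]
        if any(abs(pos - f) <= half_window for f in neighbours):
            hits += 1
    return 100.0 * hits / len(sites)
```

Calling it with `half_window=2_000` for integration sites and again for the random control sites gives the pair of percentages that the contingency tests compare.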
FIG. 4. Integration frequencies of XMRV and other retroviruses near CpG islands, DNase-hypersensitive sites, and transcription factor-binding sites. (A) CpG islands. Chromosomal regions within ±12 kbp of the CpG island were divided into 2-kbp windows. The distance upstream of the CpG island is denoted by the minus sign. The numbers of integration sites of various retroviruses or random sites in each 2-kbp window were determined and expressed as percentages of the total integration sites. (B) DNase-hypersensitive sites. Random sites or integration sites of the indicated retroviruses within ±1 kbp of DNase-hypersensitive sites were determined and the values expressed as percentages of the total integration sites. (C) Transcription factor-binding sites. Random sites or integration sites of the indicated retroviruses within ±1 kbp of known transcription factor-binding sites (NCBI) were determined and the values expressed as percentages of the total integration sites.
Retroviral integration targeting has also been linked to transcription factors. For instance, binding sites for the AP-1 and Bach1 transcription factors are enriched near MLV integration sites (58), and HIV-1 integration site pattern is altered by the absence or presence of the LEDGF/p75 transcription factor (21, 90). We analyzed the frequencies of integration sites of different retroviruses within ±1 kbp of known transcription factor-binding sites (Fig. 4C), and the percentages of integration sites for XMRV, MLV, HIV-1, and HTLV-1 were 62.1%, 68.0%, 51.1%, and 55.5%, respectively. All viruses showed a significant preference for transcription factor-binding sites compared to the random control (41.3%; P < 0.0001). Statistical analysis linking the preference of retroviral integration to individual transcription factors was not possible, due to the limited number of integration sites against a large array of transcription factors. However, we did notice that the percentages of XMRV integration sites within ±1 kbp of the binding sites for transcription factors Bach2, MZF-1, and NF-E2 p45 were five- to sixfold higher than those of the random control.
To better understand the preferences of XMRV integration for transcription start sites, CpG islands, and DNase-hypersensitive sites, we generated three data sets and constructed Venn diagrams to examine their relationships (Fig. 5). One data set contained integration sites within ±2 kbp of transcription start sites, one contained integration sites within 2 kbp of CpG islands, and one contained integration sites within 1 kbp of DNase-hypersensitive sites. For each data set representing one genomic feature, we determined the extents of the presence of the other two genomic features. As expected from the earlier results (Fig. 4 and 5), the percentages of integration sites in the three data sets for XMRV and MLV were high and accounted for 23.7% and 16.4%, respectively, of the total integration sites. These values were significantly higher (P < 0.0001) than those for the random control (4.5%), HIV-1 (3.9%), and HTLV-1 (6.5%). The sums of overlap areas involving any two or all three data sets for XMRV and MLV were 89.3% and 90.3%, respectively, of the total area, and were significantly higher (P < 0.0001) than those for the random control (34.9%), HIV-1 (27.6%), and HTLV-1 (61.9%). The results suggested that XMRV and MLV integrate preferentially into chromosomal regions containing two or more of the following genomic features: transcription start sites, CpG islands, and DNase-hypersensitive sites.
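The two percentages reported for each virus can be expressed with plain set operations. The sketch below reflects one reading of the text, stated here as an assumption: the first value is the share of all integration sites that fall in at least one of the three data sets, and the second is the share of those sites that fall in two or more. The function name and the example identifiers are hypothetical.

```python
def overlap_summary(tss, cpg, dh, total_sites):
    """Summarize three sets of site identifiers.

    Returns (percent of all sites in any set,
             percent of those sites present in two or more sets).
    """
    union = tss | cpg | dh
    shared = (tss & cpg) | (tss & dh) | (cpg & dh)
    pct_of_total = 100.0 * len(union) / total_sites
    pct_overlap = 100.0 * len(shared) / len(union) if union else 0.0
    return pct_of_total, pct_overlap


# Made-up site identifiers, for illustration only.
example = overlap_summary({1, 2, 3}, {2, 3, 4}, {3, 5}, total_sites=10)
```

In the example, five of ten sites fall in at least one set (50%), and two of those five fall in more than one set (40%).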
FIG. 5. Venn diagrams of the relationship among transcription start sites, CpG islands, and DNase-hypersensitive sites in affecting retroviral integration site preference. Red circles represent integration sites within ±2 kbp of transcription start sites (TSS), green circles represent integration sites within ±2 kbp of CpG islands (CpG), and purple circles represent integration sites within ±1 kbp of DNase-hypersensitive sites (DH). The combined numbers of integration sites in the three data sets for the random control, HIV-1, HTLV-1, XMRV, and MLV were 4.5%, 3.9%, 6.5%, 23.7%, and 16.4%, respectively, of the total integration sites. Yellow, pink, and blue denote integration sites that contain TSS and CpG, TSS and DH, and CpG and DH, respectively. White denotes integration sites that contain TSS, CpG, and DH. The percentage of overlap areas involving any two or all three data sets is indicated at the top right corner of each panel.
10%, representing a random distribution of sites into the 10 bins. For breakpoint bins with frequencies of breakpoints ranging from 55 to 687, the percentages of retroviral integration sites were largely similar to or less than those for the random control (Fig. 6A). In the two bins with the highest breakpoint frequencies (701 to 1,013 and 1,043 to 5,543), with the exception of HTLV-1 (P = 0.0525) in the bin with the highest breakpoint frequency, the percentages of integration sites for all retroviruses tested were significantly higher than those for the random control (P < 0.0001). However, none of the retroviruses tested had a percentage of integration sites significantly higher than the percentage of the RefSeq genes in the corresponding breakpoint bin, suggesting that the increase over the random control was likely a result of the preference for transcription units rather than breakpoints. A similar result was obtained when the analysis was carried out using the Mitelman recurrent-breakpoint database (data not shown).
FIG. 6. Integration frequencies of XMRV and other retroviruses near cancer breakpoints, common fragile sites, and miRNA genes. (A) Integration near cancer breakpoints. All the known cancer breakpoints (NCBI Genetics Review) within each of the 320 chromosome subbands were counted, and the subbands were ranked according to the frequencies of breakpoints and divided into 10 bins (x axis). The percentage of random sites, the percentage of RefSeq genes, and the percentages of integration sites of different retroviruses within each of the 10 breakpoint bins were then determined. (B) Common fragile sites. The percentages of random sites, RefSeq genes, and integration sites of the indicated retroviruses within common fragile sites (NCBI Genome database) were calculated. (C) miRNA genes. The percentages of random sites, RefSeq genes, and integration sites of the indicated retroviruses within ±2 Mbp of miRNA genes were calculated.
Human miRNA genes are frequently located in fragile sites and genomic regions involved in cancers (14). Additionally, miRNAs are targets for chromosome deletion and exhibit high frequencies of genomic alterations in humans (110). Altered expression of miRNAs has been documented to occur in many tumors, including prostate cancers (62, 78). We calculated the percentages of integration of the various retroviruses within ±2 Mbp of miRNA genes and compared them with that for the random control (27.5%) (Fig. 6C). The integration frequencies of all retroviruses tested were significantly higher than that of the random control (P < 0.001) (Fig. 6C). However, as with cancer breakpoints and common fragile sites, none of the retroviruses tested had a percentage of integration higher than the percentage of RefSeq genes within ±2 Mbp of miRNA genes (44.8%) (Fig. 6C).
Analysis of XMRV integration sites in human prostate cancers. We also cloned and sequenced XMRV integration sites in prostate cancer tissues from nine different prostate cancer patients. These prostate tissue samples were selected for integration site mapping based on the detection of XMRV gag sequences by nested PCR, nested RT-PCR, or quantitative RT-PCR (data not shown). The RNASEL genotype of one prostate tissue sample, VP229, was the homozygous wild type (RR), and those of the other eight samples were the homozygous variant (QQ). From these patient samples, we cloned and sequenced a total of 14 authentic integration sites, with 1 to 3 integration sites identified per positive patient sample (Table 4). Each integration site was sequenced at least twice in different experimental settings. The 14 integration sites from patient samples were referred to as XMRV-prostate cancer (XMRV-PC) to distinguish them from those obtained from acute infection of DU145 cells. The chromosomal distribution of the XMRV-PC integration sites in the human genome was mostly unremarkable, except that three independent integrations from three different patient samples were located within 1.1 Mbp in the same cytoband, q22, on chromosome 16 (Fig. 1 and Table 4). No integration hot spot was found in this region during acute infection of DU145 cells (Table 1).
TABLE 4. Characteristics of XMRV integration sites identified in human prostate cancer tissues
0.002) (Fig. 7). However, due to the relatively small sample size, the percentages of XMRV-PC integration sites associated with transcription units and transcription factor-binding sites were not significantly different from those of the random control (P = 0.265 and 0.103, respectively).
FIG. 7. Chromosomal features associated with XMRV integration sites identified in human prostate cancers. A total of 14 authentic XMRV integration sites from nine prostate cancer samples (XMRV-PC, black bars) were sequenced and mapped. The percentages of integration sites associated with the indicated chromosomal features were determined. The percentages of random sites (blue bars) and XMRV integration sites from DU145 cells (red bars) for each of the indicated chromosomal features were also included for comparison. CpG, CpG islands; DH, DNase-hypersensitive sites; TF, transcription factor-binding sites; TSS, transcription start sites.
Integration sites from prostate cancer tissues are associated with cancer breakpoints, common fragile sites, and miRNAs. As in acute infection, we analyzed the association between XMRV-PC integration sites and cancer cytogenetics or miRNA (Fig. 8). For cancer breakpoints, distribution of the XMRV-PC integration sites was biased toward regions with high breakpoint frequencies: 50.0% of XMRV-PC integration sites were located in the bin with the highest frequency (1,043 to 5,543), and an additional 21.4% were in the bin with the second-highest frequency (701 to 1,013) (Fig. 8A). Statistical analysis of integration events in the bin with the highest breakpoint frequency showed that the percentage of XMRV-PC integration sites was significantly higher than that for the random control (P = 0.0013). However, due to the relatively small sample size, the percentage of XMRV-PC integration sites was not significantly different from that of the XMRV integration sites from acutely infected cells (P = 0.133).
FIG. 8. Association of XMRV integration sites in prostate tumor samples with cancer breakpoints, common fragile sites, and miRNA genes. The percentages of random sites, XMRV integration sites from the prostate cancer samples, and XMRV integration sites from DU145 cells are represented by blue, black, and red bars, respectively. (A) Cancer breakpoints. The 10 cancer breakpoint bins were ranked as described in the legend to Fig. 6. (B) Common fragile sites. (C) miRNA genes.
In addition to cancer cytogenetics, the percentage of XMRV-PC integration sites within ±2 Mbp of miRNA genes was analyzed (Fig. 8C). The percentage of XMRV-PC integration sites near miRNA was 78.6%, which was significantly higher than those for the random control (27.5%; P < 0.0001) and XMRV integration sites from acutely infected cells (42.6%; P < 0.0114).
The mechanistic basis and determinants of target site selection during retroviral DNA integration are still not fully known, but interactions between viral proteins and cellular factors are likely involved (21, 90). HIV-1 integrase is a major factor in controlling target site selection in vitro (4, 11, 40) and has been implicated as a principal determinant of integration specificity in vivo (58). The retroviruses studied thus far can be divided into three groups according to their preferences for transcription units and CpG islands (27, 67). Phylogenetic analysis showed that the three types of retroviral integration profiles correlate with the amino acid sequences of integrase and the lengths of target site duplication (27), which presumably correspond to the spacing on the target DNA between the two viral DNA ends during integrative recombination catalyzed by integrase (12). This correlation is consistent with the idea that integrase is the major determinant of integration site selection (27, 58). Based on integration site preference and the sequence identity of integrase, XMRV fits best in the group that comprises MLV, porcine endogenous retrovirus, and foamy virus (27, 69). In terms of host determinants, the LEDGF/p75 transcription factor is a critical targeting factor during HIV-1 integration (21, 90). Transcription factor-binding sites are also favored during XMRV integration, but the involvement of a specific transcription factor in XMRV integration targeting cannot be ascertained at present. Further experiments are needed to confirm the role of integrase and to deduce the host determinants as well as the mechanisms responsible for the integration site preference observed for XMRV.
Comparisons between XMRV and MLV integration in human cells indicated that XMRV has a stronger preference for transcription start sites, CpG islands, DNase-hypersensitive sites, and gene-dense regions. The quantitative difference between XMRV and MLV may be due to the use of different human cell lines (DU145 versus HeLa). However, analyses of HIV-1 and MLV integration indicate little or no dependency of site preference on cell types (41, 67). Alternatively, given that XMRV has evolved as a xenotropic murine retrovirus in the human population for some time and presumably has adapted to replicate in human cells, XMRV may interact with human host factors differently than MLV does, even though the two viruses share a high sequence identity. Compared with that for HIV-based vectors, the higher integration preference of MLV toward transcription start sites and CpG islands has been cited as a factor contributing to the higher genotoxicity observed with MLV-derived vectors (70). Among all retroviruses analyzed, the finding that XMRV shows the strongest preference for transcription start sites, gene-dense regions, and other features associated with open, active chromosomal regions suggests that XMRV integration may carry a significant genotoxic risk, and a direct assessment of the oncogenic potential of XMRV infection is warranted.
In prostate tumor samples, analysis of XMRV integration sites also showed a preference for transcription start sites, CpG islands, and DNase-hypersensitive sites. Significantly, XMRV integration sites in tumors are commonly found within cancer breakpoints, within common fragile sites, and near miRNA genes, features that are frequently linked with human cancers. Cancer cytogenetics has been a powerful means to pinpoint the locations of cancer-initiating genes, and acquired chromosomal changes have now been reported to occur in more than 50,000 cases across all main cancer types (http://cgap.nci.nih.gov/Chromosomes/Mitelman) (50). Balanced chromosome rearrangements, particularly translocations, are strongly associated with distinct tumor entities and may represent an initial event in oncogenesis (68). The common fragile site is another cancer-associated genomic feature that is frequently altered in non-virus-associated tumors (35). Both cancer breakpoints and common fragile sites are preferential integration targets for vector DNA, hepatitis B virus, and various DNA viruses, including human papillomavirus, Epstein-Barr virus, simian virus 40, and adeno-associated virus. These integration events may contribute significantly to the development of various types of cancers by disrupting the normal activity of tumor suppressor genes or proto-oncogenes in the vicinity (35, 46, 66, 76, 77, 105). In the SCID-X1 gene-therapy trial wherein two patients received an MLV-derived vector and subsequently developed leukemia via activation of the LMO2 oncogene (37), the two integration sites targeted by the MLV-based vector reside within FRA11E, a common fragile site known to correlate with chromosomal breakpoints in tumors (7). 
Since XMRV integration in DU145 cells does not display a bias for cancer breakpoints and common fragile sites, the high XMRV integration preference seen in tumor samples for genomic regions with the highest frequencies of cancer breakpoints and common fragile sites is striking and likely represents a selection process. The key question of whether these integrated proviruses are an indirect consequence of genomic instability initiated by other genetic lesions or instead have a direct role in prostate carcinogenesis awaits further investigation.
Another remarkable finding is that three integration sites identified in three different patient samples are located within a 1.1-Mbp region in 16q22.1. Considering that analysis of this region in acutely infected cells did not reveal an integration hot spot or a higher frequency of integration than in other genomic regions with similar sizes and gene densities (data not shown), the high percentage of integration sites located within 16q22.1 in the tumor samples is consistent with a selection event. The cytoband 16q22.1 has been linked to many genetic diseases (for examples, see references 42, 51, and 95), and chromosomal deletion of this region is one of the most common and frequent genetic alterations found in many solid tumors, especially breast and prostate cancers (2, 29, 72). 16q22.1 also overlaps with one of the aphidicolin-induced common fragile sites, FRA16C, and the normal allele of the rare fragile site FRA16B that harbors AT-rich minisatellite repeats (113). This region contains 115 RefSeq genes, many of them associated with cancer. Among these genes, the CDH1 (E-cadherin), DERPC (decreased expression in renal and prostate cancer), NFATc3 (nuclear factor of activated T-cell, cytoplasmic, calcineurin-dependent 3), and HAS3 (hyaluronan synthase 3) genes have been directly linked to prostate cancer development (10, 57, 93, 96). Therefore, genetic instability in this region may contribute importantly to prostate cancer. An additional integration site of interest is in 6q21, a frequently deleted region in prostate cancer (29). Two other integration sites, one each located in 11q13.4 and 19p13.2, are also in regions where high rates of chromosomal alterations have been observed in breast and prostate cancers (19, 20, 44).
Although we did not detect any significant association between XMRV integration sites and any particular proto-oncogene or tumor suppressor gene, especially those implicated in human prostate cancers, such as MYC and PTEN (26), high percentages of cellular genes in the vicinity of the integration sites from prostate cancers matched genes listed in the selected cancer gene databases (Table 4). Some of these cancer-related genes, such as the CREB5 (107), NFATc3 (57), PIK3C2β (47, 59), and MDM4 (99) genes, are tightly linked to prostate cancer. Additionally, several other genes were identified that have potential roles in carcinogenesis but were not listed in the selected cancer gene databases. These genes include the APPBP2 (111), HAS3 (92), DLG1 (53), and SPC24 (49) genes.
Many miRNA genes are also present within 2 Mbp of the XMRV integration sites found in tumor samples. Aberrant expression of miRNAs is involved in the initiation, progression, and metastasis of human cancer (13, 32, 60, 98). miR-21 and miR-199a-1, located near integrated XMRV in cancer tissues, are significantly overexpressed in prostate cancer (103). miR-21 is an antiapoptotic and prosurvival factor and can directly modulate the expression of PTEN, a tumor suppressor that is altered in various types of tumors, including prostate (18, 64). miR-196b, another miRNA located near integrated XMRV, has a strong association with estrogen regulation in an adult zebrafish model (22). However, since miRNA genes are closely associated with common fragile sites (14), we do not know if the association between miRNA and XMRV integration sites may just be a consequence of XMRV's preference for common fragile sites or is due to specific interactions between XMRV integration machinery and host cis or trans elements near miRNA genes.
Although the causal relationship between XMRV infection and prostate cancer has not been established, our comparative analyses of XMRV integration site preference between acutely infected cells and prostate cancer tissues are consistent with a paracrine role for XMRV (63). This proposed mechanism is based on our findings that XMRV integration sites in tumor samples are associated with frequent cancer breakpoints, common fragile sites, miRNA, and cancer-related genes, but no common integration site has been detected within or near known proto-oncogenes or tumor suppressor genes. We hypothesize that the integration preference of XMRV for the regulatory region of transcriptionally active genes confers upon the virus a propensity to disrupt or alter gene expression, and cells carrying proviral insertions that provide a selective advantage or a favorable microenvironment for cancer initiation and progression will then be enriched (8, 26). The postulated paracrine mechanism is also consistent with the previous observation that XMRV is detected in only 1% of prostatic stromal and hematopoietic cells rather than carcinoma cells (101). In addition to insertional mutagenesis, we have not ruled out the possibility that native proteins encoded by XMRV may have transformation potential that can alter the growth properties of infected or neighboring cells, as has been demonstrated by some oncogenic retroviruses (33, 84).
Viruses have long been associated with cancers, and an estimated 20 to 25% of human cancers worldwide have known viral etiologies (75). Altered expression of tumor suppressors or proto-oncogenes induced by retrovirus integration is one important cause of cancer induction in animal models (65). Many viruses from the gammaretrovirus genus of the Retroviridae family, such as MLV, feline leukemia virus, and koala retrovirus, are responsible for leukemogenesis and other diseases in their respective host species (83). However, until recently, evidence of authentic infections of humans by gammaretroviruses was lacking, and therefore, human cancer formation caused by such viruses has not been fully substantiated or characterized. XMRV is an authentic human gammaretrovirus and is associated with prostate cancer patients having defective RNase L. Studies for determining the causal relationship of XMRV infection to prostate cancer and the mechanism of oncogenesis will significantly affect our appreciation of the role of viral infection in human cancers and have practical applications in identifying viral or new cellular targets for cancer prevention and treatment.
This work was supported by National Institutes of Health (NIH) grant CA68859 and a Transdisciplinary Cancer Research Grant from the UCLA Jonsson Comprehensive Cancer Center to S.A.C. and by grant W81XWH-07-1-338 from the U.S. Department of Defense Prostate Cancer Research Program, NIH grant CA103943, and the Mal and Lea Bank Chair to R.H.S. S.K. is partly supported by a Dissertation Year Fellowship Award from the UCLA Graduate Division.
Published ahead of print on 6 August 2008.
Present address: Korean Bioinformation Center, Korean Research Institute of Bioscience and Technology, Daejeon 305-806, South Korea.