Introduction
Discussion
Transposable elements, also termed transposons, are repetitive DNA sequences that can change position in the genome. About 45% of the human genome is composed of transposons, most of which are repressed and now act as gene regulation modules. De-repression of these existing transposons therefore disrupts genome stability and has been shown to drive oncogenes in cancer patients. Most of the few transposons that remain active belong to the LINE-1 family. Detection of transposons in cancer genome sequencing will allow further discovery of these events and assessment of their clinical impact. The question remains: how reliable are current transposon detection methods? In this project, three tools (TraFiC-mem, UCSC BLAT search, and deStruct) are run on a cancer patient's genome to detect transposition activity.
The effects of transposable elements, especially LINE-1, may be important in cancer development even though most of them have become repressed. Despite their potential role in promoting cancer, detection of transposable elements is currently unreliable. For example, this experiment shows discordance between the TraFiC and deStruct results. The charts below are from other comparisons of transposable element detection tools, and no pair of tools agrees on most of its predictions. For TraFiC, in the yellow bubble at the top right, the 99% precision that PCAWG claimed might be due to chance, since the tool is clearly missing many predictions. We hypothesize that the genomes given to TraFiC by the authors of the PCAWG paper were very specific in the loci and types of transposon they contained, allowing TraFiC to play to its strengths and detect them all. It is not, however, universally effective. Because transposon activity and location are effectively random, it is important to be able to predict transposons reliably without depending on such conditions.
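One way to put a number on the discordance between two tools is to ask what fraction of each tool's calls has a partner in the other tool's calls nearby. The sketch below is illustrative only: the `(chromosome, position)` call format, the function name `concordance`, and the 500 bp matching window are assumptions, not the output format or matching rule of TraFiC, deStruct, or the cited comparisons.

```python
import bisect
from collections import defaultdict


def concordance(calls_a, calls_b, window=500):
    """Return (fraction of A matched in B, fraction of B matched in A).

    Each call is a hypothetical (chromosome, position) tuple. Two calls
    match when they share a chromosome and lie within `window` base pairs.
    """
    def index(calls):
        # Group positions by chromosome and sort them for binary search.
        by_chrom = defaultdict(list)
        for chrom, pos in calls:
            by_chrom[chrom].append(pos)
        for positions in by_chrom.values():
            positions.sort()
        return by_chrom

    def matched(calls, other_index):
        hits = 0
        for chrom, pos in calls:
            positions = other_index.get(chrom, [])
            # First position in the other set that is >= pos - window.
            i = bisect.bisect_left(positions, pos - window)
            if i < len(positions) and positions[i] <= pos + window:
                hits += 1
        return hits

    idx_a, idx_b = index(calls_a), index(calls_b)
    frac_a = matched(calls_a, idx_b) / len(calls_a) if calls_a else 0.0
    frac_b = matched(calls_b, idx_a) / len(calls_b) if calls_b else 0.0
    return frac_a, frac_b
```

Reporting both fractions rather than one overlap score keeps the asymmetry visible: a tool that makes few calls can look precise while missing most of what another tool finds.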
LINE-1 transposons are the most common active elements. They are class 1 non-LTR retrotransposons and make up 17% of the human genome. LINE-1 elements are thought to have persisted in humans through the evolution of regulating and silencing mechanisms and through mutations that render them inactive. For example, the human genome contains 500,000 LINE-1 insertions, but 99% are no longer mobile. Even so, when de-repressed, LINE-1 still drives many diseases through new insertions, deletions, and recombinations. The figure to the left shows the cycle of a LINE-1 retrotransposition event.
Inspection of the breakpoint density chart shows high breakpoint density at chromosome 3, and potentially at chromosomes 1, 8, and 22, making these chromosomes candidates for transposon identification. Indeed, UCSC's Human BLAT search detected at least one LINE-1 transposon on each of these chromosomes.
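The per-chromosome tally behind a breakpoint density chart can be sketched as follows. This is a minimal illustration, assuming each predicted breakpoint is available as a `(chrom1, pos1, chrom2, pos2)` tuple; that layout and the function name `breakpoint_counts` are hypothetical and are not deStruct's actual output format.

```python
from collections import Counter


def breakpoint_counts(breakpoints):
    """Count breakpoint ends per chromosome.

    Each breakpoint joins two loci, so both of its ends are counted.
    """
    counts = Counter()
    for chrom1, _pos1, chrom2, _pos2 in breakpoints:
        counts[chrom1] += 1
        counts[chrom2] += 1
    return counts


# Toy input (made-up coordinates, for illustration only).
toy_breakpoints = [
    ("3", 1_000, "1", 5_000),
    ("3", 1_050, "8", 7_000),
    ("3", 1_100, "22", 9_000),
]
toy_counts = breakpoint_counts(toy_breakpoints)
# Chromosomes ranked from most to fewest breakpoint ends.
ranked = [chrom for chrom, _count in toy_counts.most_common()]
```

Raw counts favor long chromosomes, so dividing by chromosome length would be a natural refinement before comparing chromosomes.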
The matrix to the left shows the transpositions predicted by deStruct and UCSC BLAT, where each blue line connects two chromosomes. Most transposition events stem from chromosome 3, which corroborates the breakpoint density graph shown earlier, where chromosome 3 had the highest breakpoint density. In short, the source LINE-1 is on chromosome 3, and the other locations hold novel insertions that stem from it.
The tree on the left below is a phylogeny relating the six samples from the patient. The squares on the right are breakpoints identified by deStruct. The breakpoints follow the tree. For example, a breakpoint that implies a transposition at the "magenta" point is "passed down" the phylogeny and appears in its descendants, such as the orange point that developed into right ovary 1. This indicates that the deStruct results follow what is expected.
Methods
1. deStruct - Used to predict rearrangement breakpoints and to identify areas of high breakpoint density, which could be hotspots for DNA deletions, insertions, etc. An active L1 is likely to reinsert at multiple locations throughout the genome, producing many breakpoints in which one end overlaps the original L1. Prediction IDs on chromosomes with high breakpoint density were inspected further for L1 presence.
2. UCSC BLAT search - The breakpoint sequences for the chromosomes and prediction IDs found with deStruct were submitted to BLAT, and overlaps with instances of L1 were identified.
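The overlap logic in steps 1 and 2 (one end of a breakpoint falls inside a known L1, and the other end marks a candidate insertion site) can be sketched in a few lines. This is a simplified stand-in under stated assumptions: breakpoints are `(chrom1, pos1, chrom2, pos2)` tuples, L1 annotations are a dictionary of hypothetical names mapped to `(chrom, start, end)`, and the function name `find_source_candidates` is made up. In the project itself this check was done by inspecting BLAT results, not with this code.

```python
def find_source_candidates(breakpoints, l1_elements):
    """Group candidate insertion sites by the L1 element they point back to.

    A breakpoint is kept when exactly one of its ends lies inside an
    annotated L1; the other end is recorded as a candidate insertion site.
    """
    def containing(chrom, pos):
        # Return the name of the L1 interval containing this locus, if any.
        for name, (l1_chrom, start, end) in l1_elements.items():
            if chrom == l1_chrom and start <= pos <= end:
                return name
        return None

    partners = {}
    for chrom1, pos1, chrom2, pos2 in breakpoints:
        hit1 = containing(chrom1, pos1)
        hit2 = containing(chrom2, pos2)
        if hit1 and not hit2:
            partners.setdefault(hit1, []).append((chrom2, pos2))
        elif hit2 and not hit1:
            partners.setdefault(hit2, []).append((chrom1, pos1))
    return partners


# Toy annotation and breakpoints (made-up coordinates, illustration only).
toy_l1 = {"L1_example": ("3", 1_000, 7_000)}
toy_bps = [
    ("3", 6_900, "1", 50_000),   # one end inside the L1
    ("8", 20_000, "3", 6_950),   # one end inside the L1, order reversed
    ("5", 100, "9", 200),        # neither end inside the L1
]
toy_partners = find_source_candidates(toy_bps, toy_l1)
```

An L1 with many partner loci spread across chromosomes is the pattern that step 1 describes for an active source element.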
Transposable element detection is important yet underdeveloped. Many different tools exist, but their findings tend to contradict each other.
Given the role transposons are now known to play in potentially promoting cancer, advancing this area of research is important for better understanding and tracing cancer development...
As for the predictions from TraFiC, the results initially came out empty, indicating zero transposons. Since we knew this was not the case, TraFiC had to be debugged and rerun (link: https://gitlab.com/amcpherson/TraFiC). The results of the rerun are shown in the heat map on the right, where the light beige color indicates the presence of transposable elements. TraFiC did not predict any transposons on chromosome 3, which contradicts deStruct's findings. Likewise, the transposable elements predicted by TraFiC do not follow a plausible phylogeny: there are predicted transposons in descendants on opposite sides of the phylogenetic tree but none in between, which would mean that the ancestors did not carry the transposons and that they arose independently. This is unlikely and contradicts the principle of parsimony.
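The parsimony argument can be made concrete by counting the minimum number of gain or loss events needed to explain a presence/absence pattern on a tree. The sketch below uses Fitch's small-parsimony algorithm as an illustration; it was not part of the original analysis, and the tree shape and sample names `A` to `F` are hypothetical placeholders for the six patient samples.

```python
def fitch_changes(tree, leaf_states):
    """Return (possible ancestral states, minimum number of state changes).

    `tree` is a leaf name (str) or a 2-tuple of subtrees.
    `leaf_states` maps each leaf name to 1 (insertion called) or 0 (not called).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_changes(left, leaf_states)
    right_set, right_changes = fitch_changes(right, leaf_states)
    common = left_set & right_set
    if common:
        # The children can share a state, so no extra change is needed here.
        return common, left_changes + right_changes
    # The children disagree, which costs one additional change.
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical six-sample tree; the real patient phylogeny is not reproduced.
toy_tree = ((("A", "B"), "C"), (("D", "E"), "F"))

# Pattern that follows the tree: one whole clade carries the insertion.
clade_pattern = {"A": 1, "B": 1, "C": 1, "D": 0, "E": 0, "F": 0}
# Scattered pattern: carriers on opposite sides of the tree, none in between.
scattered_pattern = {"A": 1, "B": 0, "C": 0, "D": 1, "E": 0, "F": 0}

_, clade_cost = fitch_changes(toy_tree, clade_pattern)
_, scattered_cost = fitch_changes(toy_tree, scattered_pattern)
```

On this toy tree the clade-consistent pattern needs a single event while the scattered pattern needs two independent ones, which is the kind of excess that makes a set of calls look implausible.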
3. TraFiC-mem - The same patient's BAM file was input into this detection tool. TraFiC-mem is a recent transposon detection tool used in Bernardo Rodriguez-Martin's "Pan-cancer analysis of whole genomes identifies driver rearrangements promoted by LINE-1 retrotransposition". The paper reported a false positive rate of less than 5% and a precision of 99% for TraFiC. The authors validated this by analyzing single-molecule whole-genome sequencing data from one cancer cell line with a high retrotransposition rate, and by reanalyzing a mock cancer genome into which somatic retrotransposition events had previously been seeded at different levels of tumor clonality, with sequencing reads simulated to the average coverage of the PCAWG dataset. The figure on the right shows how TraFiC-mem works.
Rodriguez-Martin, et al., 2020