Over the past few weeks, I have progressed further in my research project – developing a computational tool to infer cancer phylogeny. Now that I am mostly finished with coding the various parts of the application, I have begun testing my code with simulated data, and the error rates…still have plenty of room for improvement. When I feel discouraged after spending hours poring over the code and manually inspecting the placement of each mutation, it helps to reflect on the impact of my research and why I decided to embark on this project in the first place.
I hardly need to describe the significance of cancer. It remains one of the leading causes of death worldwide, with millions of new cases diagnosed each year. I know several relatives and friends who are affected by the disease, and you probably do too. No one denies that it is important to find a cure for cancer, and with so much money, time, and effort spent trying to find the cure, the fact that it hasn’t been cured yet speaks to the sheer complexity of the disease. There are over 200 types of cancer, each caused by a different set of mutations in the DNA that results in uncontrollable cell growth and replication. As the tumor grows, new cancer cells inherit the mutations from their parent cells and may gain new mutations that confer different characteristics, resulting in a complicated architecture of tumor cells. Therefore, curing cancer requires a deep understanding of the evolutionary process that drives the disease.

The evolution of cancer cells makes it challenging to anticipate how the disease will progress and to find appropriate treatments. Research in this field is critical not only to understand the fundamental mechanism underlying cancer progression, but also to translate these findings into effective treatments. My research aims to contribute to both efforts. Many computational tools that infer cancer phylogeny already exist, but most of them consider only one type of mutation, whereas cancer cells can acquire many different types of mutations, including single-nucleotide variations (SNVs) and copy number aberrations (CNAs). Not only are there different types of mutations, but there are also different sequencing technologies for detecting them.
Single-cell DNA sequencing (scDNA-seq) allows each mutation to be matched to an individual cell, but it cannot detect both SNVs and CNAs simultaneously, and it is expensive. In contrast, bulk sequencing pools all cells together, resulting in lower resolution but also lower cost. To take advantage of the single-cell resolution of scDNA-seq, ensure the accurate detection of both SNVs and CNAs, and maintain a reasonable cost, my computational tool will infer a phylogenetic tree using SNVs detected from bulk sequencing and CNAs detected from scDNA-seq, which, to my knowledge, has not been done before. Integrating both types of mutations will reveal insights into the interplay of SNVs and CNAs during cancer progression. Moreover, it will paint a more nuanced picture of the cancer’s evolutionary history to aid doctors in developing an effective treatment strategy.
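To make the idea of a tree that carries both mutation types a little more concrete, here is a minimal Python sketch. It is only an illustration and not the tool's actual implementation: the `Clone` class, the `genotype` method, and the mutation labels are all hypothetical, and it assumes that a mutation, once gained, is never lost (real CNAs can delete a region that carries an SNV).

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class Clone:
    """A node in a toy tumor phylogeny (hypothetical structure)."""
    name: str
    snvs: Set[str] = field(default_factory=set)   # SNVs gained at this node
    cnas: Set[str] = field(default_factory=set)   # CNAs gained at this node
    parent: Optional["Clone"] = None

    def genotype(self) -> Tuple[Set[str], Set[str]]:
        """Collect every SNV and CNA on the path from this clone to the root.

        Simplifying assumption: mutations are inherited and never lost.
        """
        snvs: Set[str] = set()
        cnas: Set[str] = set()
        node: Optional[Clone] = self
        while node is not None:
            snvs |= node.snvs
            cnas |= node.cnas
            node = node.parent
        return snvs, cnas


# A tiny made-up tree: normal -> A -> {B, C}
root = Clone("normal")
a = Clone("A", snvs={"snv1"}, parent=root)
b = Clone("B", snvs={"snv2"}, cnas={"gain_chr8q"}, parent=a)
c = Clone("C", cnas={"loss_chr17p"}, parent=a)

for clone in (a, b, c):
    snvs, cnas = clone.genotype()
    print(clone.name, sorted(snvs), sorted(cnas))
```

In this toy tree, clones B and C both inherit `snv1` from their shared parent A but carry different CNAs, which is the kind of branching structure a phylogeny inference tool tries to recover from sequencing data.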
By developing a computational tool that integrates both SNVs and CNAs from different sequencing technologies, I hope to address a current gap in the field of cancer phylogeny. Ultimately, what motivates me to complete this project is knowing that I will produce a tangible tool that can be used by researchers and clinicians alike in the quest to understand and treat cancer.
Featured Image Source: https://www.genengnews.com/topics/omics/precision-medicine-looks-beyond-dna-sequences/