Finding Cause-and-Effect Genetic Variants for Traits and Diseases Using CRISPR and Single-Cell Sequencing

Understanding whether regions of the genome are responsible for particular features or increase the risk of certain diseases is a difficult task in human genetics. For genetic variations discovered in the 98% of the genome that does not encode proteins, this problem is considerably more severe.

In order to overcome these obstacles and identify causal variants and genetic pathways underlying blood cell properties, researchers from New York University and the New York Genome Center have created a novel strategy that combines genetic association studies, gene editing, and single-cell sequencing.

In addition to addressing the problem of directly linking genetic variants to human features and health, their method, known as STING-seq and reported in Science, can aid researchers in locating potential therapeutic targets for conditions having a genetic foundation.

Over the past 20 years, genome-wide association studies (GWAS) have become an important method for analyzing the human genome. Using GWAS, scientists have discovered thousands of genetic variants linked to many diseases, such as schizophrenia and diabetes, and to traits like height. These studies compare the genomes of large populations to identify variants that are more common in people who have a particular disease or trait.

GWAS can reveal which regions of the genome and which potential variants are implicated in diseases or traits. However, these associations are almost always found in the 98% of the genome that does not code for proteins, which is far less well understood than the well-researched 2% that does.

Adding to the complexity, many variants in the genome sit close to one another and are passed down through generations together, a phenomenon known as linkage. This can make it difficult to distinguish a variant that actually causes an effect from variants that merely sit nearby. Even when scientists can identify which variant is causing a disease or trait, they do not always know which gene the variant affects.
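
To make the linkage problem concrete, here is a minimal Python sketch of one standard way to measure how tightly two variants travel together: the r² statistic of linkage disequilibrium. The function name and the toy haplotypes are illustrative assumptions, not data or code from the study. When r² is close to 1, the two variants are nearly always inherited together, so an association signal alone cannot say which one is causal.

```python
def ld_r_squared(haplotypes):
    """Return r^2 between two biallelic variants.

    haplotypes: list of (allele_at_variant_1, allele_at_variant_2) pairs,
    each allele coded 0 or 1. Hypothetical helper for illustration only.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # frequency of allele 1 at variant 1
    p_b = sum(b for _, b in haplotypes) / n            # frequency of allele 1 at variant 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                               # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both variants must be polymorphic in the sample")
    return d * d / denom


# Toy data (made up): the two variants always appear together.
linked = [(1, 1)] * 4 + [(0, 0)] * 6
# Toy data (made up): the two variants are inherited independently.
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]

r2_linked = ld_r_squared(linked)
r2_unlinked = ld_r_squared(unlinked)
print(r2_linked, r2_unlinked)
```

In the first toy sample, carrying one variant predicts carrying the other, which is exactly the situation in which GWAS cannot separate a causal variant from its neighbors.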

“A major goal for the study of human diseases is to identify causal genes and variants, which can clarify biological mechanisms and inform drug targets for these diseases,” said Neville Sanjana, associate professor of biology at NYU, associate professor of neuroscience and physiology at NYU Grossman School of Medicine, a core faculty member at New York Genome Center, and the study’s co-senior author.

“The huge success in GWAS has highlighted the challenge of extracting insights into disease biology from these massive data sets. Despite all of our efforts during the past 10 years, the glass was still just half full at best. We needed a new approach,” said Tuuli Lappalainen, senior associate faculty member at the New York Genome Center, professor of genomics at the KTH Royal Institute of Technology in Sweden, and the study’s co-senior author.

A cure for sickle cell anemia

A recent development in the treatment of sickle cell anemia, a hereditary condition characterized by episodes of excruciating pain, shows how GWAS combined with state-of-the-art molecular technologies like gene editing can uncover causal variations and result in novel therapeutics.

Using GWAS, scientists identified areas of the genome important for producing fetal hemoglobin, a target based on its promise for reversing sickle cell anemia, but they did not know which exact variant drives its production.

The researchers turned to CRISPR, a gene editing tool that uses “molecular scissors to cut DNA,” according to Sanjana, to edit the regions identified by GWAS. When CRISPR edits were made at a specific location in the noncoding genome near a gene called BCL11A, the result was high levels of fetal hemoglobin.

In clinical trials, the bone marrow cells of dozens of sickle cell anemia patients have now been edited using CRISPR. After being infused back into the patients, the edited cells start to produce fetal hemoglobin, which takes the place of the mutated adult form of hemoglobin, effectively curing their sickle cell disease.

“This success story in treating sickle cell disease is a result of combining insights from GWAS with gene editing,” said Sanjana. “But it took years of research on only one disease. How do we scale this up to better identify causal variants and target genes from GWAS?”

GWAS meets CRISPR and single-cell sequencing

The research team created a workflow called STING-seq (Systematic Targeting and Inhibition of Noncoding GWAS loci with single-cell sequencing). STING-seq works by taking biobank-scale GWAS and looking for likely causal variants using a combination of biochemical hallmarks and regulatory elements. Then, the team uses CRISPR to specifically target each of the GWAS-indicated regions of the genome, followed by single-cell sequencing to assess gene and protein expression.

In their study, the researchers illustrated the use of STING-seq to discover target genes of noncoding variants for blood traits. Simple to quantify in standard blood tests, blood features like the proportions of platelets, white blood cells, and red blood cells have received extensive GWAS research. As a result, the researchers were able to use GWAS representing nearly 750,000 people from diverse backgrounds to study blood traits.

Once the researchers identified 543 candidate regions of the genome that may play a role in blood traits, they used a version of CRISPR called CRISPR inhibition that can silence precise regions of the genome.

After CRISPR silencing of regions identified by GWAS, the researchers looked at the expression of nearby genes in individual cells to see if particular genes were turned on or off. If they saw a difference in gene expression between cells where variants were and were not silenced, they could link specific noncoding regions to target genes.
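
The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the expression counts are made up, the function name is hypothetical, and a simple permutation test stands in for whatever statistical method the study actually used. It compares the expression of one gene between cells in which a candidate region was silenced and cells in which it was not.

```python
import math
import random
from statistics import mean


def perturbation_effect(expr_silenced, expr_control, n_perm=2000, seed=0):
    """Return (log2 fold change, permutation p-value) for one region-gene pair.

    expr_silenced / expr_control: per-cell expression counts of a single gene
    in cells with and without the region silenced. Hypothetical helper.
    """
    pseudo = 1.0  # pseudocount so the ratio is defined when a mean is zero
    observed = mean(expr_silenced) - mean(expr_control)
    log2fc = math.log2((mean(expr_silenced) + pseudo) / (mean(expr_control) + pseudo))

    # Shuffle the cell labels to see how often a difference this large
    # would arise by chance alone.
    pooled = list(expr_silenced) + list(expr_control)
    n_s = len(expr_silenced)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_s]) - mean(pooled[n_s:])
        if abs(diff) >= abs(observed):
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    return log2fc, p_value


# Toy counts (made up): the gene is much lower in cells where the region was silenced.
silenced = [2, 1, 0, 3, 1, 2, 1, 0, 2, 1]
control = [8, 9, 7, 10, 8, 9, 11, 7, 8, 10]

log2fc, p_value = perturbation_effect(silenced, control)
print(f"log2 fold change = {log2fc:.2f}, p = {p_value:.4f}")
```

A strongly negative fold change with a small p-value would suggest that the silenced region normally boosts expression of that gene, which is the kind of evidence used to link a noncoding region to a target gene.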

By doing this, the researchers could pinpoint which noncoding regions are central to specific traits (and which ones are not) and often also the cellular pathways through which these noncoding regions work.

“The power of STING-seq is we could apply this approach to any disease or trait,” said John Morris, a postdoctoral associate at the New York Genome Center and NYU and the first author of the study.

Using STING-seq to test clusters of plausible variants and observe their effect on genes eliminates the guesswork scientists previously faced when dealing with linkage among variants, or when relying on the gene closest to a variant, which is frequently but not always the target gene.

When the researchers applied STING-seq to a blood trait called monocyte count, one gene, CD52, stood out as being significantly altered. CD52 was close to the variant of interest, but it was not the closest gene, so it might have been missed using earlier techniques.

In another analysis, the researchers identified a gene called PTPRC that is associated with 10 blood traits, including those related to red and white blood cells and platelets. It was difficult to determine which (if any) of the numerous GWAS-identified noncoding variations in the area could affect PTPRC expression because they are so close together.

Applying STING-seq enabled them to isolate which variants were causal by seeing which changed PTPRC expression.

STING-seq and beyond

While STING-seq can identify the target gene and causal variant by silencing the variants, it does not explain the direction of the effect, that is, whether a specific noncoding variant will increase or decrease expression of a nearby gene.

The researchers took their approach a step further to create a complementary approach they call beeSTING-seq (base editing STING-seq) that uses CRISPR to precisely insert a genetic variant instead of just inhibiting that region of the genome.

The researchers hope to employ STING-seq and beeSTING-seq to find the root causes of a variety of disorders that can then be addressed either by gene editing, as was done for sickle cell anemia, or through the use of medications that target particular genes or cellular pathways.

“Now that we can connect noncoding variants to target genes, this gives us evidence that either small molecules or antibody therapies could be developed to change the expression of specific genes,” said Lappalainen.

Additional study authors include Christina Caragine, Zharko Daniloski, Lu Lu, and Kyrie Davis, of NYU and the New York Genome Center; Júlia Domingo, Marcello Ziosi, Dafni Glinos, Stephanie Hao, Eleni P. Mimitou, and Peter Smibert of the New York Genome Center; Timothy Barry and Kathryn Roeder of Carnegie Mellon University; and Eugene Katsevich of the University of Pennsylvania.