Scientists Solve One of Genomics’ Biggest Challenges by Using HiFi Sequencing to Distinguish Highly Similar Paralogous Genes
Rhea-AI Summary
PacBio (NASDAQ: PACB) has announced a breakthrough study published in Nature Communications showcasing a new method for analyzing complex genomic regions. The research introduces Paraphase, an informatics tool that, when combined with HiFi long-read sequencing, enables high-precision variant detection in 316 genes located in previously inaccessible segmental duplication regions.
The study, led by researchers from PacBio, GeneDx, and a global genomics consortium, successfully analyzed 160 long segmental duplication regions, including 9 medically-relevant genes. Key findings include the discovery of 7 new de novo single nucleotide variants and 4 de novo gene conversion events in 36 trios, as well as comprehensive copy number variability analysis across populations.
The research demonstrates Paraphase's ability to analyze complex gene families such as CYP21A2/CYP21A1P, SMN1/SMN2, and OPN1LW/OPN1MW, which are associated with congenital adrenal hyperplasia, spinal muscular atrophy, and red-green color blindness, respectively. The technology overcomes traditional sequencing limitations by phasing haplotypes across paralogous gene families, offering improved accuracy in genetic variation detection.
Positive
- Breakthrough in analyzing previously inaccessible genomic regions
- Successfully developed new tool (Paraphase) for improved genetic analysis
- Technology enables analysis of medically relevant genes linked to serious conditions
- Demonstrated superior accuracy compared to traditional sequencing methods
Negative
- None.
Insights
PacBio's publication in Nature Communications represents a significant technical breakthrough in genomic analysis. Their HiFi sequencing technology, paired with the new Paraphase informatics tool, has successfully analyzed 316 genes in segmental duplication regions that were previously inaccessible to conventional sequencing methods.
The technical achievement is remarkable: these paralogous genes (nearly identical copies) have long been the "dark matter" of genomics. The ability to distinguish between highly similar gene copies such as SMN1/SMN2 (linked to spinal muscular atrophy) and CYP21A2/CYP21A1P (associated with congenital adrenal hyperplasia) opens research avenues previously blocked by technological limitations.
What separates this from incremental advances is the demonstration that PacBio's long-read approach can resolve complex genomic structures that short-read technologies fundamentally cannot address due to their inherent limitations. The identification of previously undetected de novo variants and gene conversion events showcases the technology's potential to reveal hidden genetic mechanisms.
The collaboration with GeneDx points toward potential clinical applications, suggesting a path from research discovery to diagnostic implementation. However, the transition from publication to clinical adoption typically requires additional validation studies and regulatory considerations.
For PacBio, this advances their competitive positioning in specialized sequencing applications where accuracy in complex regions is paramount. While immediate revenue impact may be modest, the technology enhancement strengthens their value proposition in the high-precision genomic analysis segment.
PacBio's Nature Communications study demonstrates a distinctive competitive advantage in one of sequencing's most challenging areas. By enabling accurate analysis of segmental duplications, PacBio addresses a persistent technological gap that has limited genomic research and diagnostic capabilities.
This advancement differentiates PacBio's platform in the crowded sequencing market. While competitors dominate in throughput and cost efficiency for straightforward genomic regions, PacBio now showcases superior capabilities in these complex regions containing medically relevant genes.
The technology's ability to phase haplotypes across paralogous gene families leverages the unique combination of read length and accuracy that defines PacBio's HiFi approach. This creates a technical moat in specialized applications where alternative technologies fundamentally fall short.
The collaboration with GeneDx, a clinical genomics provider, suggests potential commercialization pathways in diagnostic testing. Conditions like spinal muscular atrophy and congenital adrenal hyperplasia represent high-value diagnostic targets where improved accuracy could translate to better clinical outcomes.
For investors, this reinforces PacBio's technological leadership in accuracy-critical applications and suggests potential expansion into specialized clinical markets. While the direct revenue impact may develop gradually as research advances translate to clinical applications, this technological differentiation strengthens PacBio's competitive positioning in the precision genomics landscape.
MENLO PARK, Calif., March 17, 2025 (GLOBE NEWSWIRE) -- PacBio (NASDAQ: PACB), a leading provider of high-quality, highly accurate sequencing platforms, today announced a newly published study in Nature Communications unveiling a powerful new method for analyzing some of the most complex regions of the human genome. Led by researchers from PacBio, GeneDx, and a global consortium of genomics experts, the study utilizes Paraphase, an informatics tool that, when paired with HiFi long-read sequencing, allows for high-precision variant detection and copy number analysis in 316 genes within previously inaccessible segmental duplication regions, including 9 challenging medically relevant genes.
Segmental duplications (SDs) are highly similar, duplicated regions of the genome that have posed persistent challenges for genetic analysis. These regions contain hundreds of genes critical to human health—including those implicated in spinal muscular atrophy (SMN1/SMN2), congenital adrenal hyperplasia (CYP21A2), and red-green color blindness (OPN1LW/OPN1MW)—but their high sequence similarity makes accurate mapping and variant detection nearly impossible with short-read sequencing. Paraphase, combined with HiFi sequencing, overcomes these challenges by phasing haplotypes across paralogous gene families, providing a more complete and accurate view of genetic variation. This is enabled by the length and accuracy of reads from HiFi sequencing.
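The role of read length described above can be illustrated with a toy sketch. It is not Paraphase's algorithm: the two sequences, the read coordinates, and the helper names `differing_sites` and `read_is_assignable` are all hypothetical, chosen only to show that a read can be attributed to one gene copy only if it spans at least one position where the copies differ.

```python
def differing_sites(a: str, b: str) -> list[int]:
    """Return the positions where two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def read_is_assignable(start: int, length: int, sites: list[int]) -> bool:
    """A read can be tied to one copy only if it overlaps a differing site."""
    return any(start <= s < start + length for s in sites)


# Two made-up "paralogs", identical except at positions 5 and 45.
gene = "ACGTA" + "C" + "G" * 39 + "T" + "ACGTA"
paralog = "ACGTA" + "T" + "G" * 39 + "C" + "ACGTA"
sites = differing_sites(gene, paralog)

# A short read in the identical stretch fits both copies equally well.
short_read_ok = read_is_assignable(start=10, length=20, sites=sites)

# A read spanning the whole region covers both differing sites, so it can be
# assigned to one copy and also links the two sites on a single haplotype.
long_read_ok = read_is_assignable(start=0, length=len(gene), sites=sites)

print(sites, short_read_ok, long_read_ok)
```

In real segmental duplications the distinguishing sites can be thousands of bases apart, which is why both the length and the per-base accuracy of the reads matter.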
Study Reveals Previously Inaccessible Regions of the Genome
By applying Paraphase to 160 long (>10 kb) segmental duplication regions spanning 316 genes, the researchers revealed new insights into genetic variation across five ancestral populations.
Among the key findings:
- Newly Identified De Novo Variants in SDs in Parent-Offspring Trios: Analysis of 36 trios uncovered 7 previously undetected de novo single nucleotide variants (SNVs) and 4 de novo gene conversion events, two of which were non-allelic—a level of detail not possible with traditional sequencing approaches.
- Copy Number Variability Across Populations: The study profiled the copy number distributions of paralog groups across populations, showing high copy number variability in many gene families in SDs. It also provided a new approach for identifying false duplications in the reference genome.
- Gene Conversion Drives Sequence Similarity between Genes and Paralogs: The team identified 23 paralog groups with strikingly low genetic diversity between genes and paralogs, indicating that frequent gene conversion and/or unequal crossing-over may have played a role in preserving highly similar gene copies over time.
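The de novo finding in the list above rests on a simple idea that can be sketched as follows. This is a simplified illustration, not the study's method: the variant coordinates are invented, the function name `de_novo_candidates` is hypothetical, and a real analysis would work on phased haplotypes and would also have to separate true de novo mutations from gene conversion and from sequencing error.

```python
Variant = tuple[int, str]  # (position, alternate base); a hypothetical encoding


def de_novo_candidates(
    child: set[Variant], mother: set[Variant], father: set[Variant]
) -> set[Variant]:
    """Return the child's variants that are seen in neither parent."""
    return child - (mother | father)


# Invented example data for one trio.
mother = {(100, "A"), (450, "G")}
father = {(100, "A"), (900, "C")}
child = {(100, "A"), (900, "C"), (1200, "T")}

candidates = de_novo_candidates(child, mother, father)
print(candidates)
```

The check is only meaningful when variants in all three family members have been placed on the correct gene copy, which is the step that is difficult in highly similar paralogs.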
“For decades, sequencing technologies have struggled to provide reliable data on paralogous genes—some of the most medically relevant but hardest to analyze regions of the genome,” said Dr. Michael A. Eberle, Vice President of Bioinformatics at PacBio and senior author of the study. “With Paraphase and HiFi sequencing, we now have a scalable way to accurately genotype SD-encoded genes across diverse populations, filling in long-standing gaps in genomic research and improving our ability to identify disease-linked variants.”
The study also highlights how Paraphase can disentangle medically important gene families that have long required specialized, multi-step assays like MLPA and Sanger sequencing. For example, in the CYP21A2/CYP21A1P region—where mutations cause congenital adrenal hyperplasia—the researchers characterized a previously overlooked duplication allele carrying both a functional CYP21A2 copy and a CYP21A2(Q319X) pseudogene copy, which could have led to misclassification in standard tests.
“This study demonstrates that when we use HiFi sequencing we see a much richer and more complex picture of genetic variation,” said Dr. Xiao Chen, lead author of the study and principal scientist at PacBio. “Paraphase enables the precise resolution of genetic regions that have been largely inaccessible until now, providing new opportunities for disease research, population genetics, and potentially even clinical testing.”
“Long-read genome sequencing offers the ability to detect variants that are difficult to identify using other testing methods, particularly in regions with highly similar sequence,” said Dr. Paul Kruszka, MD, FACMG, Chief Medical Officer at GeneDx. “This work may enhance variant detection, resolve complex genomic regions, and provide more answers for patients and families, so we are encouraged by the prospect of the data.”
The full study, “Genome-wide profiling of highly similar paralogous genes using HiFi sequencing,” is now available in Nature Communications.
About PacBio
PacBio (NASDAQ: PACB) is a premier life science technology company that is designing, developing and manufacturing advanced sequencing solutions to help scientists and clinical researchers resolve genetically complex problems. Our products and technologies stem from two highly differentiated core technologies focused on accuracy, quality and completeness which include our HiFi long-read sequencing and our SBB® short-read sequencing technologies. Our products address solutions across a broad set of research applications including human germline sequencing, plant and animal sciences, infectious disease and microbiology, and oncology. For more information, please visit www.pacb.com and follow @PacBio.
PacBio products are provided for Research Use Only. Not for use in diagnostic procedures.
Forward Looking Statements
This press release may contain “forward-looking statements” within the meaning of Section 21E of the Securities Exchange Act of 1934, as amended, and the U.S. Private Securities Litigation Reform Act of 1995. All statements other than statements of historical fact are forward-looking statements, including statements relating to the uses, coverage, advantages, and benefits or expected benefits of using PacBio products or technologies, including in connection with providing a scalable way to accurately genotype SD-encoded genes across diverse populations, fill in long-standing gaps in genomic research, and improve the ability to identify disease-linked variants; enabling precise resolution of genetic regions that were previously largely inaccessible; providing new opportunities for disease research, population genetics, and potential clinical testing; potentially detecting or enhancing the detection of variants difficult to identify using other methods, resolving complex genomic regions, and providing more answers for patients and families; and other future events. You should not place undue reliance on forward-looking statements because they are subject to assumptions, risks, and uncertainties that could cause actual outcomes and results to differ materially from currently anticipated results, including the difficulty of generating discoveries in complicated areas of biology; potential performance, quality and regulatory issues; and third-party claims alleging infringement of patents and proprietary rights or seeking to invalidate PacBio's patents or proprietary rights. Additional factors that could materially affect actual results can be found in PacBio's most recent filings with the Securities and Exchange Commission, including PacBio's most recent reports on Forms 8-K, 10-K, and 10-Q, and include those listed under the caption "Risk Factors."
These forward-looking statements are based on current expectations and speak only as of the date hereof; except as required by law, PacBio disclaims any obligation to revise or update these forward-looking statements to reflect events or circumstances in the future, even if new information becomes available.
Contacts
Investors and Media:
Todd Friedman
ir@pacificbiosciences.com
Media:
ir@pacificbiosciences.com