A Method and Software Significantly Improving the Accuracy Of Genome Assemblies: SEQuel

University of California System: University of California, San Diego
posted on 07/27/2012

Assemblies of next generation sequencing (NGS) data, while accurate, still contain a substantial number of errors that need to be corrected after the assembly process. Earlier assembly algorithms developed for Sanger sequencing follow an “overlap – layout – consensus” paradigm, where consensus refers to fixing errors in the contigs. Since this paradigm faces difficulties in short read assembly, most NGS assemblers employ a de Bruijn graph approach that effectively deals with large amounts of data. However, most NGS assemblers neglect the consensus step, i.e., there is no postprocessing of the contigs in Velvet and many other popular assemblers. Relying on high and uniform coverage, NGS assembly algorithms push the burden of producing high quality assemblies onto the construction of the de Bruijn graph. Our work demonstrates that NGS assemblers can benefit from the use of a consensus step. There are currently no other tools that aim to accomplish this goal.

Suggested Uses

Correcting errors in contigs from high throughput sequencing (HTS) assemblies. These might include bacterial, plant, or vertebrate genomes that have not been previously sequenced, or the products of transcript assembly.


  • Removed 35% to 96% of small-scale assembly errors.
  • Introduced positional de Bruijn graph for contig refinement.
  • Demonstrated utility in hard (single-cell) assembly.
  • SEQuel can be used in combination with any NGS assembler.

Innovation Details

Detailed Description

UCSD researchers have recently developed a method and companion software, SEQuel, to correct errors (i.e., insertions, deletions, and substitutions) in the assembled contigs of NGS data. Fundamental to SEQuel is the positional de Bruijn graph, a graph structure that models k-mers within reads while incorporating the approximate positions of reads into the model. SEQuel takes as input an assembled contig, the paired-end reads that align to that contig, and the approximate positions where they aligned, and returns a refined contig.
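To make the idea of a positional de Bruijn graph concrete, the following is a minimal sketch in Python and not the SEQuel implementation. It assumes that a node is a pair (k-mer, approximate position) and that two occurrences of the same k-mer are merged only when their positions differ by at most a tolerance. The names `build_positional_de_bruijn`, `positional_kmers`, and `delta`, as well as the toy reads, are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def positional_kmers(read, start, k):
    """Yield (k-mer, approximate contig position) pairs for one aligned read."""
    for offset in range(len(read) - k + 1):
        yield read[offset:offset + k], start + offset


def build_positional_de_bruijn(aligned_reads, k, delta):
    """Build a toy positional de Bruijn graph (illustrative assumption, not SEQuel's code).

    aligned_reads: iterable of (sequence, approximate start position on the contig).
    A k-mer occurrence joins an existing node only if its position lies within
    `delta` of that node's representative position; otherwise a new node is made.
    """
    nodes = defaultdict(list)  # k-mer -> representative positions seen so far
    edges = defaultdict(int)   # (node_u, node_v) -> number of supporting reads

    def node_for(kmer, pos):
        for rep in nodes[kmer]:
            if abs(rep - pos) <= delta:
                return (kmer, rep)
        nodes[kmer].append(pos)
        return (kmer, pos)

    for seq, start in aligned_reads:
        prev = None
        for kmer, pos in positional_kmers(seq, start, k):
            cur = node_for(kmer, pos)
            if prev is not None:
                edges[(prev, cur)] += 1  # consecutive k-mers in a read form an edge
            prev = cur
    return nodes, edges


# Toy input: the second read's alignment position is off by one on purpose,
# to mimic the "approximate positions" mentioned in the description.
reads = [("ACGTACG", 0), ("CGTACGT", 2)]
nodes, edges = build_positional_de_bruijn(reads, k=3, delta=1)
```

In this toy input the k-mer ACG occurs twice in the first read, four positions apart. A plain de Bruijn graph would collapse both occurrences into one node and create a cycle, whereas the positional variant sketched here keeps them as separate nodes because their positions differ by more than `delta`.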

File Number: 22625 

IP Protection

Copyright: ©2012, The Regents of the University of California

License Online

This innovation is currently not available for online licensing. Please contact the University of California, San Diego Technology Transfer Office for more information.
