Methods of comparative genome assembly

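When several genome assemblies are available for the same species, how can they be merged into a more complete genome?
Parrish N, Sudakov B, Eskin E. Genome reassembly with high-throughput sequencing data[J]. BMC genomics, 2013, 14(Suppl 1): S8 summarizes the existing approaches as follows: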
A number of software packages have been developed in recent years with the aim of utilizing a set of reference genomes to produce a more optimized scaffolding, or layout, of the contigs produced in de novo assembly. OSLay [20] uses a maximum-weight matching algorithm to identify likely neighboring contigs. Treecat [21] builds a fully connected graph of the contigs, with edges weighted by the distance between syntenic regions in the reference, and attempts to find a minimum-weight Hamiltonian path through the graph using a greedy heuristic. Finally, PGA [22] uses a genetic algorithm to search the space of possible contig orderings. By relying on the contigs produced through de novo assembly, however, these methods may not take full advantage of the reference genome.
Starting from that article, I looked for methods:

1. AMOScmp

Reference: Pop M, Phillippy A, Delcher A L, et al. Comparative genome assembly[J]. Briefings in bioinformatics, 2004, 5(3): 237-248.

AMOScmp applies a modified MUMmer algorithm to a newly sequenced genome by mapping it onto a reference genome.

Hawkeye and AMOS are available open source at http://amos.sourceforge.net.

2. Projector 2

Reference: van Hijum S A F T, Zomer A L, Kuipers O P, et al. Projector 2: contig mapping for efficient gap-closure of prokaryotic genome sequence assemblies[J]. Nucleic acids research, 2005, 33(suppl 2): W560-W566.

The article states:
Projector 2 has several distinctive features: a user-friendly web interface, automatic removal of repetitive elements (repeat-masking) and automated primer design for gap-closure purposes. The web interface is freely accessible at http://molgen.biol.rug.nl/websoftware/projector2.

3. OSLay

Reference: Richter D C, Schuster S C, Huson D H. OSLay: optimal syntenic layout of unfinished assemblies[J]. Bioinformatics, 2007, 23(13): 1573-1579.

The article states:
The underlying algorithm is based on maximum weight matching. The tool provides an interactive visualization of the computed layout and the result can be imported into the assembly editing tool Consed to support the design of primer pairs for gap closure.

OSLay is freely available from: http://www-ab.informatik.uni-tuebingen.de/software/oslay
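To make the "maximum weight matching" idea concrete, here is a minimal sketch in Python. It is not OSLay's implementation: the contig-end names (`A.R`, `B.L`, ...), the edge weights, and the function `max_weight_matching` are all made up for illustration, and the exhaustive search is only practical for a handful of edges. The assumption is that each edge links two contig ends and that its weight reflects how strongly the reference supports those ends being neighbors.

```python
def max_weight_matching(edges):
    """Return (total_weight, chosen_edges) by exhaustive search.

    edges: list of (end_a, end_b, weight) tuples.
    Exponential time, so this is a toy for tiny inputs only.
    """
    if not edges:
        return 0, []
    (a, b, w), rest = edges[0], edges[1:]
    # Option 1: leave the first edge out of the matching.
    skip_w, skip_set = max_weight_matching(rest)
    # Option 2: take it, then drop every edge that touches a or b,
    # because each contig end may be matched at most once.
    compatible = [e for e in rest if a not in e[:2] and b not in e[:2]]
    take_w, take_set = max_weight_matching(compatible)
    if w + take_w > skip_w:
        return w + take_w, [(a, b, w)] + take_set
    return skip_w, skip_set


# Hypothetical contig ends: "A.R" is the right end of contig A, and so on.
candidate_links = [
    ("A.R", "B.L", 5),
    ("A.R", "C.L", 3),
    ("B.R", "C.L", 4),
    ("B.R", "A.L", 1),
]
total, layout = max_weight_matching(candidate_links)
print(total, layout)
```

In this toy input the matching keeps the links A.R-B.L and B.R-C.L, which corresponds to the layout A, B, C. A real tool would use a polynomial-time matching algorithm instead of brute force.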

4. PGA

Reference: Zhao F, Zhao F, Li T, et al. A new pheromone trail-based genetic algorithm for comparative genome assembly[J]. Nucleic acids research, 2008, 36(10): 3455-3462.

The article states:
A pheromone trail-based genetic algorithm (PGA) was used to search globally for the optimal placement for each contig. An extended version of PGA can predict additional candidate connections for each contig and can thus increase the likelihood of identifying the correct arrangement of each contig. The software and test data sets can be accessed at http://sourceforge.net/projects/pga4genomics/.

5. Treecat

Reference: Husemann P, Stoye J. Phylogenetic comparative assembly[J]. Algorithms for Molecular Biology, 2010, 5(3).

The article states:
Our new algorithm for contig ordering uses sequence similarity as well as phylogenetic information to estimate adjacencies of contigs. An evaluation of our implementation shows that it performs better than recent approaches while being much faster at the same time.
The software is open source (GPL) and available within the Comparative Genomics – Contig Arrangement Toolsuite (cg-cat, http://bibiserv.techfak.uni-bielefeld.de/cg-cat) on the Bielefeld Bioinformatics Server (BiBiServ).
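The passage from Parrish et al. quoted at the top describes Treecat as searching for a minimum-weight Hamiltonian path through a fully connected contig graph with a greedy heuristic. The sketch below shows one common greedy strategy for that problem (accept the cheapest edges first, as long as no contig gets more than two neighbors and no cycle forms). It is an assumption that this resembles what Treecat does; the function `greedy_contig_path`, the contig names, and the distances are hypothetical.

```python
def greedy_contig_path(contigs, dist):
    """Greedy approximation of a minimum-weight Hamiltonian path.

    contigs: list of contig names.
    dist: dict {(u, v): weight} with one entry per unordered contig pair.
    """
    degree = {c: 0 for c in contigs}
    parent = {c: c for c in contigs}      # union-find forest for cycle checks
    neighbours = {c: [] for c in contigs}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    accepted = 0
    for (u, v), _w in sorted(dist.items(), key=lambda item: item[1]):
        if accepted == len(contigs) - 1:
            break
        # Accept an edge only if both contigs still have a free end
        # and the edge joins two different path fragments.
        if degree[u] < 2 and degree[v] < 2 and find(u) != find(v):
            parent[find(u)] = find(v)
            degree[u] += 1
            degree[v] += 1
            neighbours[u].append(v)
            neighbours[v].append(u)
            accepted += 1

    # Walk the resulting path, starting from one of its endpoints.
    start = min(c for c in contigs if degree[c] <= 1)
    path, prev = [start], None
    while True:
        nxt = [n for n in neighbours[path[-1]] if n != prev]
        if not nxt:
            break
        prev = path[-1]
        path.append(nxt[0])
    return path


# Hypothetical distances between syntenic regions of four contigs.
distances = {
    ("c1", "c2"): 1, ("c2", "c3"): 2, ("c3", "c4"): 1,
    ("c1", "c3"): 5, ("c1", "c4"): 9, ("c2", "c4"): 6,
}
print(greedy_contig_path(["c1", "c2", "c3", "c4"], distances))
```

A greedy choice like this is fast but not guaranteed to find the optimal ordering, which is why the quoted text calls it a heuristic.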
