{"id":2107,"date":"2014-06-09T20:39:01","date_gmt":"2014-06-09T12:39:01","guid":{"rendered":"http:\/\/www.chenlianfu.com\/?p=2107"},"modified":"2014-06-09T20:39:01","modified_gmt":"2014-06-09T12:39:01","slug":"%e4%bd%bf%e7%94%a8masurca%e8%bf%9b%e8%a1%8c%e5%9f%ba%e5%9b%a0%e7%bb%84%e7%bb%84%e8%a3%85","status":"publish","type":"post","link":"http:\/\/www.chenlianfu.com\/?p=2107","title":{"rendered":"Genome Assembly with MaSuRCA"},"content":{"rendered":"<h1>1. MaSuRCA Overview<\/h1>\n<p>MaSuRCA (Maryland Super Read Cabog Assembler) is a genome assembler that combines the advantages of the de Bruijn graph and Overlap-Layout-Consensus approaches.<br \/>\nReference: <a href=\"http:\/\/bioinformatics.oxfordjournals.org\/content\/29\/21\/2669.short\" target=\"_blank\">Zimin A V, Mar\u00e7ais G, Puiu D, et al. The MaSuRCA genome assembler[J]. Bioinformatics, 2013, 29(21): 2669-2677.<\/a><\/p>\n<h1>2. MaSuRCA Download and Installation<\/h1>\n<pre>\r\n# Download the MaSuRCA 2.2.1 release tarball\r\n$ wget ftp:\/\/ftp.genome.umd.edu\/pub\/MaSuRCA\/MaSuRCA-2.2.1.tar.gz\r\n# Unpack it into \/opt\/biosoft (the directory must already exist)\r\n$ tar zxf MaSuRCA-2.2.1.tar.gz -C \/opt\/biosoft\r\n# Build and install from the unpacked directory\r\n$ cd \/opt\/biosoft\/MaSuRCA-2.2.1\r\n$ .\/install.sh\r\n<\/pre>\n<h1>3. 
MaSuRCA Usage<\/h1>\n<h2>3.1 Preparing the Configuration File<\/h2>\n<p>Copy the template configuration file \u201c\/opt\/biosoft\/MaSuRCA-2.2.1\/sr_config_example.txt\u201d to the current working directory and edit it. This file describes the input files and the assembly parameters. Its contents are as follows:<\/p>\n<pre>\r\n# Sequencing data. There are three entry types: PE, JUMP and OTHER.\r\n# Each PE or JUMP entry takes 5 fields: 1) a two-character prefix; 2) the mean insert size; 3) the standard deviation of the insert size; 4) reads1 in fastq(.gz) format; 5) reads2 in fastq(.gz) format.\r\n# If a jump library is of the FR type, still use JUMP, but give the mean insert size as a negative number.\r\n# Any other data must be converted to Celera-compatible .frg files.\r\nDATA\r\nPE= p1 180 20 180_1.fastq 180_2.fastq\r\nPE= p2 500 50 500_1.fastq 500_2.fastq\r\nJUMP= j1 2000 200 2000_1.fastq 2000_2.fastq\r\nJUMP= j2 5000 500 5000_1.fastq 5000_2.fastq\r\nOTHER= file.frg\r\nEND\r\n\r\nPARAMETERS\r\n# k-mer size: a value from 25 to 101, or auto to have the optimal value computed automatically.\r\nGRAPH_KMER_SIZE=auto\r\n# Set to 1 when only Illumina data is used; set to 0 if there is 1x or more of 454 data.\r\nUSE_LINKING_MATES=1\r\n# Too much jumping library data may confuse the assembler; when this value is set to 
60, only about 60x of the jumping data is used for the assembly. For bacterial genomes it is usually set to 60. For larger genomes, use a larger value; for some large eukaryotic genomes it can be as high as 1000.\r\nLIMIT_JUMP_COVERAGE = 60\r\n# Celera Assembler parameters. For mammalian genomes, cgwErrorRate must not exceed 0.15.\r\nCA_PARAMETERS = ovlMerSize=30 cgwErrorRate=0.25 ovlMemory=4GB\r\n# Discard k-mers whose count is below this value. If the coverage is greater than 100, this can be set to 2.\r\nKMER_COUNT_THRESHOLD = 1\r\n# Number of threads to use; replace the $NUM_THREADS placeholder with an actual number.\r\nNUM_THREADS= $NUM_THREADS\r\n# jellyfish hash size. This can be set to \"genome size + number of reads\".\r\nJF_SIZE=100000000\r\n# Whether to trim long homopolymers at the 3' ends of reads (e.g. 
GGGGGGG). Suitable for genomes with high GC content.\r\nDO_HOMOPOLYMER_TRIM=0\r\nEND\r\n<\/pre>\n<h2>3.2 Running masurca and assemble.sh to Assemble the Genome<\/h2>\n<p>Run the masurca program to generate assemble.sh, then run assemble.sh to perform the assembly.<\/p>\n<pre>\r\n# Generate assemble.sh from the edited configuration file (here named config.txt)\r\n$ \/opt\/biosoft\/MaSuRCA-2.2.1\/bin\/masurca config.txt\r\n# Run the assembly\r\n$ .\/assemble.sh\r\n<\/pre>\n<h2>3.3 Resuming After an Interruption<\/h2>\n<p>If the run stops because of an error or is terminated manually, delete the files generated by the interrupted step, run masurca again to generate an assemble.sh containing the remaining steps, and then run it to continue the assembly.<\/p>\n<h1>4. Output File<\/h1>\n<p>The final result file is CA\/10-gapclose\/genome.ctg.fasta.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>1. 
MaSuRCA Overview MaSuRCA(Maryland Super Rea &hellip; <a href=\"http:\/\/www.chenlianfu.com\/?p=2107\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/posts\/2107"}],"collection":[{"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2107"}],"version-history":[{"count":1,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/posts\/2107\/revisions"}],"predecessor-version":[{"id":2108,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/posts\/2107\/revisions\/2108"}],"wp:attachment":[{"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2107"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2107"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2107"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}