{"id":1747,"date":"2013-07-11T15:41:45","date_gmt":"2013-07-11T07:41:45","guid":{"rendered":"http:\/\/www.hzaumycology.com\/chenlianfu_blog\/?p=1747"},"modified":"2013-10-25T22:22:46","modified_gmt":"2013-10-25T14:22:46","slug":"%e5%9f%ba%e5%9b%a0%e7%bb%84repeat-sequence%e7%9a%84%e5%af%bb%e6%89%be","status":"publish","type":"post","link":"http:\/\/www.chenlianfu.com\/?p=1747","title":{"rendered":"Finding Repeat Sequences in Genomes"},"content":{"rendered":"<h1>1. Introduction to repeat sequences and related software<\/h1>\n<p>Based on the review: <a href=\"http:\/\/www.nature.com\/hdy\/journal\/v104\/n6\/full\/hdy2009165a.html\" target=\"_blank\">Identifying repeats and transposable elements in sequenced genomes: how to find your way through the dense forest of programs. E Lerat. Heredity (2010) 104, 520\u2013533; doi:10.1038\/hdy.2009.165; published online 25 November 2009<\/a><\/p>\n<h2>1.1 Classification of repeats<\/h2>\n<p>Repeats in a genome fall into two classes according to their sequence features: tandem repeats, and interspersed repeats scattered across the genome. The second class consists mainly of transposable elements (TEs).<\/p>\n<p>Tandem repeats include microsatellites, also called simple sequence repeats (repeat unit of 1-6 bases), and minisatellites (longer repeat unit of 10-60 bases).<\/p>\n<p>TEs come in two types: class-I TEs transpose through an RNA-mediated (copy and paste) mechanism; class-II TEs transpose through a DNA-mediated (cut and paste) mechanism. 
The former are called retroelements and the latter DNA transposons.<\/p>\n<p>Class-I TEs consist mainly of LTR (long terminal repeat) retrotransposons, parts of whose sequence may be protein-coding. Non-LTR retrotransposons comprise two subclasses: LINEs (long interspersed nuclear elements) and SINEs (short interspersed nuclear elements); the former may be protein-coding, the latter are not.<\/p>\n<p>Within class-II TEs, a subclass has been added for DNA-based elements that nonetheless transpose through a &#8220;copy and paste&#8221; mechanism (Wicker et al., 2007); class-II also includes MITEs (miniature inverted repeat transposable elements).<\/p>\n<h2>1.2 Software for identifying repeats<\/h2>\n<p>Different kinds of repeats call for different programs. The identification methods can be grouped as library-based, signature-based (relying on the characteristic structure of a repeat), or de novo. The review lists many programs: <a href=\"http:\/\/www.nature.com\/hdy\/journal\/v104\/n6\/fig_tab\/hdy2009165t1.html#figure-title\" target=\"_blank\">http:\/\/www.nature.com\/hdy\/journal\/v104\/n6\/fig_tab\/hdy2009165t1.html#figure-title<\/a>.<\/p>\n<h3>1.2.1 
Library-based software<\/h3>\n<p>Library-based software requires a library containing consensus sequences of repeat families from many species; repeats are then identified by similarity search against that library. This approach can identify every kind of repeat. The classic and most popular program of this kind is RepeatMasker; CENSOR and MASKERAID can be used to improve RepeatMasker results. GREEDIER (Li et al., 2008) is another program for genome-wide repeat identification; its paper reports fairly good performance, with slightly higher sensitivity than RepeatMasker but a repeat identification rate only about half of RepeatMasker's.<\/p>\n<h3>1.2.2 Signature-based software<\/h3>\n<p>Signature-based methods are mainly used to identify TEs.<br \/>\n1. 
Identifying LTR retrotransposons: LTR_STRUC (McCarthy and McDonald, 2003), LTR_PAR (Kalyanaraman and Aluru, 2006), FIND_LTR (Rho et al., 2007), RETROTECTOR (Sperber et al., 2007), LTR_FINDER (Xu and Wang, 2007) and LTRHARVEST (Ellinghaus et al., 2008). In the review, LTR_STRUC had the highest sensitivity of these programs (96%) but an LTR identification rate of only 30%, whereas LTRHARVEST had the highest identification rate (42%) with a sensitivity of 67%. Overall, the author recommends, in order, LTRHARVEST and FIND_LTR (sensitivity 83%, identification rate 37%).<br \/>\n2. Identifying non-LTR retrotransposons: TSDFINDER (Szak et al., 2002), SINEDR (Tu et al., 2004) and RTANALYZER (Lucier et al., 2007). The first validates L1 insertions detected by RepeatMasker; the second detects SINEs flanked by TSDs (target site duplications: short duplicated sequences created on both sides of a repeat when it inserts into the genome); the third detects L1 retrotransposons by scoring features such as TSDs, the polyA tail and the 5&#8217; endonuclease site.<br \/>\n3. 
Identifying MITEs: FINDMITE (Tu, 2001), TRANSPO (Santiago et al., 2002), MITE Analysis Kit (MAK) (Yang and Hall, 2003) and MITE Uncovering SysTem (MUST) (Chen et al., 2009). The review's author reports that the first program produced errors and that the third could not be downloaded. The second cannot find new MITEs, so the newest, fourth program appears to be the best choice.<br \/>\n4. Identifying helitrons: HELITRONFINDER. This program (<a href=\"http:\/\/www.biomedcentral.com\/1471-2164\/9\/51\" target=\"_blank\">Du et al., 2008<\/a>) was designed to find helitrons in the maize genome (helitrons have been found in both animals and plants).<\/p>\n<h3>1.2.3 De novo prediction software<\/h3>\n<h3>1.2.3.1 Self-comparison approaches<\/h3>\n<p>The sequence is compared against itself with tools such as BLAST or PALS in order to find repeats. Programs include: REPEAT PATTERN TOOLKIT (Agarwal and States, 1994), RECON (Bao and Eddy, 2002), PILER (Edgar and Myers, 2005) and the BLASTER suite (used in Quesneville et al., 2005). RECON is the most widely used.<\/p>\n<h3>1.2.3.2 k-mer and spaced seed approaches<\/h3>\n<p>A k-mer of a given length that occurs many times can be called a repeat; a spaced seed is a variant of a k-mer that tolerates some differences within the seed.<\/p>\n<p>Programs include: REPUTER (Kurtz and Schleiermacher, 1999), VMATCH (Kurtz, unpublished), REPEAT-MATCH (Delcher et al., 1999), MER-ENGINE (Healy et al., 2003), FORREPEATS (Lefebvre et al., 2003), REAS (Li et al., 2005), REPEATSCOUT (Price et al., 2005), RAP 
(Campagna et al., 2005), REPSEEK (Achaz et al., 2007), TALLYMER (Kurtz et al., 2008) and P-CLOUDS (Gu et al., 2008).<\/p>\n<h3>1.2.4 Other repeat identification software<\/h3>\n<p>Some programs identify repeats other than TEs. Tandem Repeats Finder (TRF) (Benson, 1999), Tandem Repeat Occurrence Locator (TROLL) (Castelo et al., 2002), MREPS (Kolpakov et al., 2003), TRAP (Sobreira et al., 2006) and Optimized Moving Window Spectral Analysis (OMWSA) (Du et al., 2007) were developed specifically to detect tandem repeats. The Inverted Repeat Finder (IRF) program (Warburton et al., 2004) was designed to search for inverted repeats.<\/p>\n<h2>1.3 Pipelines that integrate several programs<\/h2>\n<p>The REPEATMODELER pipeline (Smit, unpublished; http:\/\/www.repeatmasker.org\/RepeatModeler.html) includes the programs RECON, REPEATSCOUT, REPEATMASKER and TRF. It uses the output of the RECON and REPEATSCOUT programs to build, refine and classify consensus models of putative interspersed repeats.<\/p>\n<p>The review also describes many other special-purpose pipelines. REPEATMODELER is currently the most widely used pipeline for identifying repeats in genomes, and its use is described below.<\/p>\n<h1>2. 
Installing and using RepeatModeler<\/h1>\n<h2>2.1 Installation<\/h2>\n<p><a href=\"http:\/\/www.repeatmasker.org\/\" target=\"_blank\">RepeatMasker<\/a> and <a href=\"http:\/\/www.repeatmasker.org\/RepeatModeler.html\" target=\"_blank\">RepeatModeler<\/a> are software from ISB (Institute for Systems Biology). ISB is located in the South Lake Union neighborhood.<\/p>\n<p>According to the <a href=\"http:\/\/www.repeatmasker.org\/RMDownload.html\" target=\"_blank\">RepeatMasker documentation<\/a>, installing and running it requires Perl 5.8.0 or later, a sequence search engine, TRF and a repeat database. Several search engines may be installed, but only one is used per run; the options include Cross_match, RMBlast, HMMER and ABBlast\/WUBlast.<\/p>\n<p>According to the <a href=\"http:\/\/www.repeatmasker.org\/RepeatModeler.html\" target=\"_blank\">RepeatModeler documentation<\/a>, installing and running it requires Perl 5.8.8 or later, RepeatMasker, a repeat database, RECON, RepeatScout, TRF and a sequence search engine. Several search engines may be installed, but only one is used per run; the options are RMBlast and ABBlast\/WUBlast.<\/p>\n<p>Once the required software is installed, run configure for RepeatMasker and RepeatModeler and enter the path of each program when prompted to complete the installation.<\/p>\n<h2>2.2 Using RepeatModeler<\/h2>\n<h3>2.2.1 Building a library from the genome sequence with RepeatModeler<\/h3>\n<pre>\r\n$ $RepeatModelerHome\/BuildDatabase -name species \\\r\n  -engine ncbi species.genome.fasta\r\n$ 
$RepeatModelerHome\/RepeatModeler -database species\r\n<\/pre>\n<p>This produces a directory named RM_[PID].[DATE], e.g. &#8220;RM_5098.MonMar141305172005&#8221;. The file &#8220;consensi.fa.classified&#8221; in that directory is the library, which serves as input to RepeatMasker. Note that the FASTA file apparently must not contain blank space between sequences (spaces, empty lines, etc.); otherwise the program fails.<\/p>\n<h3>2.2.2 Masking and quantifying repeats with RepeatMasker<\/h3>\n<pre>\r\n$ cp RM_5098.MonMar141305172005\/consensi.fa.classified .\r\n$ mkdir Repeat_result\r\n$ $RepeatMaskerHome\/RepeatMasker -pa 8 \\\r\n  -e ncbi -lib consensi.fa.classified \\\r\n  -dir Repeat_result -gff species.genome.fasta\r\n<\/pre>\n<p>The result files are written to the Repeat_result directory.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>1. 
Introduction to repeat sequences and related software. Based on the review: Identifying  &hellip; <a href=\"http:\/\/www.chenlianfu.com\/?p=1747\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/posts\/1747"}],"collection":[{"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1747"}],"version-history":[{"count":16,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/posts\/1747\/revisions"}],"predecessor-version":[{"id":1973,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/posts\/1747\/revisions\/1973"}],"wp:attachment":[{"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1747"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1747"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1747"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}