{"id":1224,"date":"2013-05-07T06:02:00","date_gmt":"2013-05-06T22:02:00","guid":{"rendered":"http:\/\/www.hzaumycology.com\/chenlianfu_blog\/?p=1224"},"modified":"2013-05-08T00:21:19","modified_gmt":"2013-05-07T16:21:19","slug":"augustus%e7%9a%84%e5%ae%89%e8%a3%85%e5%92%8c%e4%bd%bf%e7%94%a8%e6%96%b9%e6%b3%95","status":"publish","type":"post","link":"http:\/\/www.chenlianfu.com\/?p=1224","title":{"rendered":"Augustus Installation and Usage Parameters"},"content":{"rendered":"<p><a href=\"http:\/\/bioinf.uni-greifswald.de\/augustus\/\" target=\"_blank\">AUGUSTUS<\/a> is a program that predicts genes in eukaryotic genomic sequences.<\/p>\n<h1>1. Installing Augustus<\/h1>\n<p>Augustus download: <a href=\"http:\/\/bioinf.uni-greifswald.de\/augustus\/binaries\/\" target=\"_blank\">http:\/\/bioinf.uni-greifswald.de\/augustus\/binaries\/<\/a><\/p>\n<pre>$ wget http:\/\/bioinf.uni-greifswald.de\/augustus\/binaries\/augustus.2.7.tar.gz\r\n$ tar zxf augustus.2.7.tar.gz\r\n$ cd augustus.2.7\r\n$ cd src\r\n$ make -j 8\r\n$ export AUGUSTUS_CONFIG_PATH=$PWD\/..\/config\/  # can be added to .bashrc (use the absolute path there)<\/pre>\n<h1>2. 
Using Augustus<\/h1>\n<h2>2.1 Gene prediction examples<\/h2>\n<pre>\r\n$ augustus --strand=both --genemodel=partial --singlestrand=false --hintsfile=hints.gff --extrinsicCfgFile=extrinsic.cfg --protein=on --introns=on --start=on --stop=on --cds=on --codingseq=on --alternatives-from-evidence=true --gff3=on --UTR=on --outfile=out.gff --species=human genome.fa\r\n$ augustus --noprediction=true --species=SPECIES sequences.gb<\/pre>\n<h2>2.2 Augustus parameters<\/h2>\n<p>Usage:<\/p>\n<pre>augustus [parameters] --species=SPECIES queryfilename<\/pre>\n<p>Important parameters:<\/p>\n<pre><span style=\"color: #ff00ff;\">--strand=both,<\/span> --strand=forward or --strand=backward\r\n    report predicted genes on both strands, just the forward or \r\njust the backward strand. Default is 'both'.\r\n\r\n<span style=\"color: #ff00ff;\">--genemodel=partial, --genemodel=intronless, --genemodel=complete,<\/span> \r\n<span style=\"color: #ff00ff;\">--genemodel=atleastone or --genemodel=exactlyone<\/span>\r\n\u00a0 \u00a0 partial : allow prediction of incomplete genes at the sequence boundaries (default)\r\n\u00a0 \u00a0 intronless : only predict single-exon genes like in prokaryotes and some eukaryotes\r\n\u00a0 \u00a0 complete : only predict complete genes\r\n\u00a0 \u00a0 atleastone : predict at least one complete gene\r\n\u00a0 \u00a0 exactlyone : predict exactly one complete gene\r\n\r\n<span style=\"color: #ff00ff;\">--singlestrand=true<\/span>\r\n\u00a0 \u00a0 predict genes independently on each strand, allow overlapping\r\n genes on opposite strands. This option is turned off by default.\r\n\r\n<span style=\"color: #ff00ff;\">--hintsfile=hintsfilename<\/span>\r\n\u00a0 \u00a0 When this option is used the prediction considering hints \r\n(extrinsic information) is turned on. 
hintsfilename contains the hints \r\nin GFF format.\r\n\r\n<span style=\"color: #ff00ff;\">--extrinsicCfgFile=cfgfilename<\/span>\r\n\u00a0 \u00a0 Optional. This file contains the list of used sources for the \r\nhints and their boni and mali. If not specified, the file \r\n\"extrinsic.cfg\" in the config directory $AUGUSTUS_CONFIG_PATH is used.\r\n\r\n--maxDNAPieceSize=n\r\n\u00a0 \u00a0 This value specifies the maximal length of the pieces that the \r\nsequence is cut into for the core algorithm (Viterbi) to be run. \r\nDefault is --maxDNAPieceSize=200000.\r\n\u00a0 \u00a0 AUGUSTUS tries to place the boundaries of these pieces in the \r\nintergenic region, which is inferred by a preliminary prediction. \r\nGC-content dependent parameters are chosen for each piece of DNA \r\nif \/Constant\/decomp_num_steps &gt; 1 for that species. This is why \r\nthis value should not be set very large, even if you have plenty \r\nof memory.\r\n\r\n<span style=\"color: #ff00ff;\">--protein=on\/off\r\n--introns=on\/off\r\n--start=on\/off\r\n--stop=on\/off\r\n--cds=on\/off\r\n--codingseq=on\/off<\/span>\r\n\u00a0 \u00a0 Output options. Output predicted protein sequence, introns, \r\nstart codons, stop codons. Or use 'cds' in addition to 'initial', \r\n'internal', 'terminal' and 'single' exon. 
The CDS excludes the \r\nstop codon (unless stopCodonExcludedFromCDS=false) whereas the \r\nterminal and single exon include the stop codon.\r\n\r\n--AUGUSTUS_CONFIG_PATH=path\r\n\u00a0 \u00a0 path to config directory (if not specified as environment \r\nvariable)\r\n\r\n<span style=\"color: #ff00ff;\">--alternatives-from-evidence=true\/false<\/span>\r\n\u00a0 \u00a0 report alternative transcripts when they are suggested by hints\r\n\r\n<span style=\"color: #ff00ff;\">--alternatives-from-sampling=true\/false<\/span>\r\n\u00a0 \u00a0 report alternative transcripts generated through probabilistic \r\nsampling\r\n\r\n--sample=n\r\n--minexonintronprob=p\r\n--minmeanexonintronprob=p\r\n--maxtracks=n\r\n\r\n--proteinprofile=filename\r\n\u00a0 \u00a0 Read a protein profile from file filename. See section 7 below.\r\n\r\n--predictionStart=A, --predictionEnd=B\r\n\u00a0 \u00a0 A and B define the range of the sequence for which predictions \r\nshould be found. Quicker if you need predictions only for a small \r\npart.\r\n\r\n<span style=\"color: #ff00ff;\">--gff3=on\/off<\/span>\r\n\u00a0 \u00a0 output in gff3 format.\r\n\r\n--UTR=on\/off\r\n\u00a0 \u00a0 predict the untranslated regions in addition to the coding \r\nsequence. This currently works only for human, galdieria, \r\ntoxoplasma and caenorhabditis.\r\n\r\n<span style=\"color: #ff00ff;\">--outfile=filename<\/span>\r\n\u00a0 \u00a0 print output to filename instead of to standard output. This is \r\nuseful for computing environments, e.g. parasol jobs, which do \r\nnot allow shell redirection.\r\n\r\n--noInFrameStop=true\/false\r\n\u00a0 \u00a0 Don't report transcripts with in-frame stop codons. Otherwise, \r\nintron-spanning stop codons could occur. Default: false\r\n\r\n<span style=\"color: #ff00ff;\">--noprediction=true\/false<\/span>\r\n\u00a0 \u00a0 If true and input is in genbank format, no prediction is made. \r\nUseful for getting the annotated protein sequences. 
Augustus can also take a \r\nGenBank-format file as input, predict genes, and compare the predictions with the GenBank annotation \r\nto report accuracy statistics.\r\n\u00a0 \u00a0 Some sequences in a GenBank file may have no CDS annotation, however. This \r\noption can be used to detect them and obtain their sequence IDs, so that the sequences without CDS annotation can be \r\nremoved manually before the prediction accuracy is evaluated.\r\n\r\n--contentmodels=on\/off\r\n\u00a0 \u00a0 If 'off' the content models are disabled (all emissions \r\nuniformly 1\/4). The content models are: coding region Markov chain \r\n(emiprobs), initial k-mers in coding region (Pls), intron and \r\nintergenic region Markov chain. This option is intended for special \r\napplications that require judging gene structures from the signal \r\nmodels only, e.g. for predicting the effect of SNPs or mutations \r\non splicing. For all typical gene predictions, this should be \r\non. 
Default: on\r\n\r\n--paramlist\r\n    For a complete list of parameters, type \"augustus --paramlist\"<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>AUGUSTUS is a program that predicts gene &hellip; <a href=\"http:\/\/www.chenlianfu.com\/?p=1224\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[],"tags":[],"_links":{"self":[{"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/posts\/1224"}],"collection":[{"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1224"}],"version-history":[{"count":14,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/posts\/1224\/revisions"}],"predecessor-version":[{"id":1251,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=\/wp\/v2\/posts\/1224\/revisions\/1251"}],"wp:attachment":[{"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1224"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1224"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.chenlianfu.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1224"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}