Extractfeat options
Gary Williams, Tel 01223 494522
gwilliam at hgmp.mrc.ac.uk
Wed Apr 2 08:29:52 UTC 2003
It looks like you are running:
extractfeat refseq:NC_001806 stdout -tag gene
This pulls out every feature that carries the tag name 'gene'
(e.g. /gene="RL1"), whatever the feature's type. So it matches both:
    gene 513..1259
         /gene="RL1"
and
    CDS 513..1259
        /gene="RL1"
You should be using:
extractfeat refseq:NC_001806 stdout -type gene
which finds only the features whose type name is 'gene', such as:
    gene 513..1259
         /gene="RL1"
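To make the tag/type distinction concrete, here is a minimal Python sketch. It is not extractfeat's code; the feature list, the dictionary layout and the function names select_by_tag and select_by_type are assumptions made up for illustration, and it uses exact string matching only.

```python
# Toy model of the two feature-table entries shown above.
# Each feature has a type (gene, CDS, ...) and a set of tags (qualifiers).
features = [
    {"type": "gene", "location": "513..1259", "tags": {"gene": "RL1"}},
    {"type": "CDS", "location": "513..1259", "tags": {"gene": "RL1"}},
]


def select_by_tag(feats, tag):
    """Keep features carrying a qualifier named `tag` (the -tag behaviour)."""
    return [f for f in feats if tag in f["tags"]]


def select_by_type(feats, ftype):
    """Keep features whose type equals `ftype` (the -type behaviour)."""
    return [f for f in feats if f["type"] == ftype]


by_tag = [f["type"] for f in select_by_tag(features, "gene")]
by_type = [f["type"] for f in select_by_type(features, "gene")]

print(by_tag)   # ['gene', 'CDS']  both features carry /gene=...
print(by_type)  # ['gene']         only the feature of type 'gene'
```

Selecting on the tag returns the CDS as well because the CDS feature also has a /gene qualifier; selecting on the type does not.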
I'll add a report of specified tag values in the output description for
you soon, Burke.
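Until that is available, the idea can be sketched in Python as below. The function name describe, its parameters and the way the tag value is appended in parentheses are all hypothetical; this is not the format extractfeat produces, only an illustration of adding a tag value to the description line of the output you posted.

```python
def describe(seq_id, start, end, ftype, description, tags, wanted):
    """Build a FASTA header line, appending the requested tag values if present.

    The layout of the appended text is an assumption made for this sketch.
    """
    header = f">{seq_id}_{start}_{end} [{ftype}] {description}"
    extras = [f'{name}="{tags[name]}"' for name in wanted if name in tags]
    if extras:
        header += " (" + ", ".join(extras) + ")"
    return header


header = describe(
    "NC_001806", 513, 1259, "gene",
    "Human herpesvirus 1, complete genome.",
    {"gene": "RL1"},   # qualifiers of the feature
    ["gene"],          # tag names whose values should be reported
)
print(header)
```

A tag that is requested but absent from the feature is simply skipped, so the header is left unchanged in that case.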
Regards,
Gary
Burke Squires wrote:
>
> Hello All,
>
> I am having a bit of trouble extracting just genes from a GenBank file. I
> have tried the obvious options to no avail. I want to get JUST the gene
> information but I always get gene and CDS as below. How do I do that?
>
> Additionally, can I get the gene name instead of the stuff below?
>
> Thanks!
>
> Burke
>
> >NC_001806_513_1259 [gene] Human herpesvirus 1, complete genome.
> atggcccgccgccgccgccatcgcggcccccgccgcccccggccgcccgggcccacgggc
> gccgtcccaaccgcacagtcccaggtaacctccacgcccaactcggaacccgcggtcagg
> agcgcgcccgcggccgccccgccgccgccccccgccggtgggcccccgccttcttgttcg
> ctgctgctgcgccagtggctccacgttcccgagtccgcgtccgacgacgacgatgacgac
> gactggccggacagccccccgcccgagccggcgccagaggcccggcccaccgccgccgcc
> ccccggccccggcccccaccgcccggcgtgggcccggggggcggggctgacccctcccac
> cccccctcgcgccccttccgccttccgccgcgcctcgccctccgcctgcgcgtcaccgcg
> gagcacctggcgcgcctgcgcctgcgacgcgcgggcggggagggggcgccggagcccccc
> gcgacccccgcgacccccgcgacccccgcgacccccgcgacccccgcgcgggtgcgcttc
> tcgccccacgtccgggtgcgccacctggtggtctgggcctcggccgcccgcctggcgcgc
> cgcggctcgtgggcccgcgagcgggccgaccgggctcggttccggcgccgggtggcggag
> gccgaggcggtcatcgggccgtgcctggggcccgaggcccgtgcccgggccctggcccgc
> ggagccggcccggcgaactcggtctaa
> >NC_001806_513_1259 [CDS] Human herpesvirus 1, complete genome.
> atggcccgccgccgccgccatcgcggcccccgccgcccccggccgcccgggcccacgggc
> gccgtcccaaccgcacagtcccaggtaacctccacgcccaactcggaacccgcggtcagg
> agcgcgcccgcggccgccccgccgccgccccccgccggtgggcccccgccttcttgttcg
> ctgctgctgcgccagtggctccacgttcccgagtccgcgtccgacgacgacgatgacgac
> gactggccggacagccccccgcccgagccggcgccagaggcccggcccaccgccgccgcc
> ccccggccccggcccccaccgcccggcgtgggcccggggggcggggctgacccctcccac
> cccccctcgcgccccttccgccttccgccgcgcctcgccctccgcctgcgcgtcaccgcg
> gagcacctggcgcgcctgcgcctgcgacgcgcgggcggggagggggcgccggagcccccc
> gcgacccccgcgacccccgcgacccccgcgacccccgcgacccccgcgcgggtgcgcttc
> tcgccccacgtccgggtgcgccacctggtggtctgggcctcggccgcccgcctggcgcgc
> cgcggctcgtgggcccgcgagcgggccgaccgggctcggttccggcgccgggtggcggag
> gccgaggcggtcatcgggccgtgcctggggcccgaggcccgtgcccgggccctggcccgc
> ggagccggcccggcgaactcggtctaa
> >NC_001806_2261_2317 [gene] Human herpesvirus 1, complete genome.
> atggagccccgccccggagcgagtacccgccggcctgagggccgcccccagcgcgag
> >NC_001806_3083_3749 [gene] Human herpesvirus 1, complete genome.
> cccgccccggatgtctgggtgtttccctgcgaccgagacctgccggacagcagcgactct
> gaggcggagaccgaagtgggggggcggggggacgccgaccaccatgacgacgactccgcc
> tccgaggcggacagcacggacacggaactgttcgagacggggctgctggggccgcagggc
> gtggatgggggggcggtctcgggggggagccccccccgcgaggaagaccccggcagttgc
> gggggcgccccccctcgagaggacggggggagcgacgagggcgacgtgtgcgccgtgtgc
> acggatgagatcgcgccccacctgcgctgcgacaccttcccgtgcatgcaccgcttctgc
> atcccgtgcatgaaaacctggatgcaattgcgcaacacctgcccgctgtgcaacgccaag
> ctggtgtacctgatagtgggcgtgacgcccagcgggtcgttcagcaccatcccgatcgtg
> aacgacccccagacccgcatggaggccgaggaggccgtcagggcgggcacggccgtggac
> tttatctggacgggcaatcagcggttcgccccgcggtacctgaccctgggggggcacacg
> gtgagggccctgtcgcccacccacccggagcccaccacggacgaggatgacgacgacctg
> gacgacg
>
> --
> Burke Squires
> Bioinformatics
> MacroGenics, Inc.
> Dallas, TX
--
Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK