Extractfeat options

Gary Williams, Tel 01223 494522 gwilliam at hgmp.mrc.ac.uk
Wed Apr 2 08:29:52 UTC 2003


It looks like you are doing:
extractfeat refseq:NC_001806 stdout -tag gene

This will pull out the features like:

     gene            513..1259
                     /gene="RL1"
 
or

     CDS             513..1259
                     /gene="RL1"

both of which carry the tag name 'gene', e.g. /gene="RL1"

You should be using:
extractfeat refseq:NC_001806 stdout -type gene

which will find only features like:

     gene            513..1259
                     /gene="RL1"

which have the type name 'gene'
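
If it helps to see the difference spelled out, here is a rough Python sketch
of the two kinds of selection. This is only an illustration, not how
extractfeat is implemented: the feature records and the function names
select_by_tag and select_by_type are made up for the example, and it uses
exact string matching for simplicity.

```python
# Illustrative toy model only; this is not EMBOSS code.
# Each feature has a type (e.g. "gene", "CDS") and a dict of qualifier tags.
features = [
    {"type": "gene", "location": "513..1259", "tags": {"gene": "RL1"}},
    {"type": "CDS", "location": "513..1259", "tags": {"gene": "RL1"}},
]


def select_by_tag(feats, tag):
    """Keep features that carry a qualifier named `tag` (like -tag)."""
    return [f for f in feats if tag in f["tags"]]


def select_by_type(feats, ftype):
    """Keep features whose type is exactly `ftype` (like -type)."""
    return [f for f in feats if f["type"] == ftype]


by_tag = [f["type"] for f in select_by_tag(features, "gene")]
by_type = [f["type"] for f in select_by_type(features, "gene")]

print(by_tag)   # both the gene and the CDS feature carry /gene
print(by_type)  # only the feature whose type is 'gene'
```

Selecting on the tag matches both records because each has a /gene
qualifier; selecting on the type matches only the 'gene' feature.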

I'll add a report of specified tag values in the output description for
you soon, Burke.

Regards,
Gary

Burke Squires wrote:
> 
> Hello All,
> 
> I am having a bit of trouble extracting just genes from a GenBank file. I
> have tried the obvious options to no avail. I want to get JUST the gene
> information but I always get gene and CDS as below. How do I do that?
> 
> Additionally, can I get the gene name instead of the stuff below?
> 
> Thanks!
> 
> Burke
> 
> >NC_001806_513_1259 [gene] Human herpesvirus 1, complete genome.
> atggcccgccgccgccgccatcgcggcccccgccgcccccggccgcccgggcccacgggc
> gccgtcccaaccgcacagtcccaggtaacctccacgcccaactcggaacccgcggtcagg
> agcgcgcccgcggccgccccgccgccgccccccgccggtgggcccccgccttcttgttcg
> ctgctgctgcgccagtggctccacgttcccgagtccgcgtccgacgacgacgatgacgac
> gactggccggacagccccccgcccgagccggcgccagaggcccggcccaccgccgccgcc
> ccccggccccggcccccaccgcccggcgtgggcccggggggcggggctgacccctcccac
> cccccctcgcgccccttccgccttccgccgcgcctcgccctccgcctgcgcgtcaccgcg
> gagcacctggcgcgcctgcgcctgcgacgcgcgggcggggagggggcgccggagcccccc
> gcgacccccgcgacccccgcgacccccgcgacccccgcgacccccgcgcgggtgcgcttc
> tcgccccacgtccgggtgcgccacctggtggtctgggcctcggccgcccgcctggcgcgc
> cgcggctcgtgggcccgcgagcgggccgaccgggctcggttccggcgccgggtggcggag
> gccgaggcggtcatcgggccgtgcctggggcccgaggcccgtgcccgggccctggcccgc
> ggagccggcccggcgaactcggtctaa
> >NC_001806_513_1259 [CDS] Human herpesvirus 1, complete genome.
> atggcccgccgccgccgccatcgcggcccccgccgcccccggccgcccgggcccacgggc
> gccgtcccaaccgcacagtcccaggtaacctccacgcccaactcggaacccgcggtcagg
> agcgcgcccgcggccgccccgccgccgccccccgccggtgggcccccgccttcttgttcg
> ctgctgctgcgccagtggctccacgttcccgagtccgcgtccgacgacgacgatgacgac
> gactggccggacagccccccgcccgagccggcgccagaggcccggcccaccgccgccgcc
> ccccggccccggcccccaccgcccggcgtgggcccggggggcggggctgacccctcccac
> cccccctcgcgccccttccgccttccgccgcgcctcgccctccgcctgcgcgtcaccgcg
> gagcacctggcgcgcctgcgcctgcgacgcgcgggcggggagggggcgccggagcccccc
> gcgacccccgcgacccccgcgacccccgcgacccccgcgacccccgcgcgggtgcgcttc
> tcgccccacgtccgggtgcgccacctggtggtctgggcctcggccgcccgcctggcgcgc
> cgcggctcgtgggcccgcgagcgggccgaccgggctcggttccggcgccgggtggcggag
> gccgaggcggtcatcgggccgtgcctggggcccgaggcccgtgcccgggccctggcccgc
> ggagccggcccggcgaactcggtctaa
> >NC_001806_2261_2317 [gene] Human herpesvirus 1, complete genome.
> atggagccccgccccggagcgagtacccgccggcctgagggccgcccccagcgcgag
> >NC_001806_3083_3749 [gene] Human herpesvirus 1, complete genome.
> cccgccccggatgtctgggtgtttccctgcgaccgagacctgccggacagcagcgactct
> gaggcggagaccgaagtgggggggcggggggacgccgaccaccatgacgacgactccgcc
> tccgaggcggacagcacggacacggaactgttcgagacggggctgctggggccgcagggc
> gtggatgggggggcggtctcgggggggagccccccccgcgaggaagaccccggcagttgc
> gggggcgccccccctcgagaggacggggggagcgacgagggcgacgtgtgcgccgtgtgc
> acggatgagatcgcgccccacctgcgctgcgacaccttcccgtgcatgcaccgcttctgc
> atcccgtgcatgaaaacctggatgcaattgcgcaacacctgcccgctgtgcaacgccaag
> ctggtgtacctgatagtgggcgtgacgcccagcgggtcgttcagcaccatcccgatcgtg
> aacgacccccagacccgcatggaggccgaggaggccgtcagggcgggcacggccgtggac
> tttatctggacgggcaatcagcggttcgccccgcggtacctgaccctgggggggcacacg
> gtgagggccctgtcgcccacccacccggagcccaccacggacgaggatgacgacgacctg
> gacgacg
> 
> --
> Burke Squires
> Bioinformatics
> MacroGenics, Inc.
> Dallas, TX

-- 
Gary Williams               Tel: +44 1223 494522  Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk            http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK


