[Bioperl-l] How do you create a genbank file?
Jason Stajich
jason at cgt.duhs.duke.edu
Fri Oct 17 08:27:27 EDT 2003
If you are parsing FASTA files with properly set NCBI headers, I thought we
had added a way to get all of this set automatically, but perhaps not...
You can set GI number with
$seq->primary_id($ginumber);
If you want to create RichSeq objects instead of Bio::Seq objects (in the
event you want to set some fields which are only available in RichSeq
objects), initialize your Bio::SeqIO fasta parser like this:
use Bio::SeqIO;
use Bio::Seq::SeqFactory;
my $seqio = Bio::SeqIO->new(
    -format     => 'fasta',
    -file       => $file,
    -seqfactory => Bio::Seq::SeqFactory->new(-type => 'Bio::Seq::RichSeq'),
);
Alternatively, you can set the seqfactory after you have initialized
the SeqIO object:
$seqio->sequence_factory(
    Bio::Seq::SeqFactory->new(-type => 'Bio::Seq::RichSeq'));
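
Putting the pieces together, a rough and untested sketch of the whole
conversion might look like the following. The file names are hypothetical
placeholders, and the accession, version, GI, division and molecule values
are simply the ones from the question quoted below. It assumes the
molecule(), division(), accession_number() and seq_version() setters of
Bio::Seq::RichSeq; exactly how the VERSION line is rendered may depend on
your BioPerl version.

```perl
use strict;
use warnings;
use Bio::SeqIO;
use Bio::Seq::SeqFactory;

# Hypothetical file names, for illustration only.
my ($infile, $outfile) = ('input.fasta', 'output.gb');

# Read FASTA, but build Bio::Seq::RichSeq objects instead of plain Bio::Seq.
my $in = Bio::SeqIO->new(
    -format     => 'fasta',
    -file       => $infile,
    -seqfactory => Bio::Seq::SeqFactory->new(-type => 'Bio::Seq::RichSeq'),
);
my $out = Bio::SeqIO->new(-format => 'genbank', -file => ">$outfile");

while (my $seq = $in->next_seq) {
    # Example values taken from the original question; in practice they
    # would be parsed from the FASTA header or looked up elsewhere.
    $seq->molecule('mRNA');              # LOCUS molecule type
    $seq->division('MAM');               # LOCUS division code
    $seq->accession_number('AB050006');
    $seq->seq_version(1);                # version part of the VERSION line
    $seq->primary_id('26453358');        # GI number
    $out->write_seq($seq);
}
```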
-jason
On Fri, 17 Oct 2003, Marc Logghe wrote:
> > My question is how do I set the following?
> >
> > mRNA (instead of dna)
> > MAM (instead of UNK)
> > VERSION AB050006.1 GI:26453358 <- I can't get this line to appear
> > SOURCE Bos taurus (cow)
> > ORGANISM Bos taurus
> >
> >
> To set the version you should use:
> $seq->seq_version($version); # $version is e.g. 1
> Problem is, it is not possible to set the GI number. As far as I know, when you pass a genbank file, Bio::SeqIO does not even parse it, at least it does not show up when you Data::Dump the resulting Bio::Seq::RichSeq object.
> There is no slot for that information, because it does not exist in e.g. an EMBL sequence record.
> Concerning the organism, first create the Bio::Species object. In case you only have the string 'Bos taurus' in your fasta, of course you can not generate the full classification. At least not using only your fasta data.
> my $species = Bio::Species->new(-classification => [reverse split /\s+/, $organism]);
> $seq->species($species);
>
> gives you:
> SOURCE Bos taurus
> ORGANISM Bos taurus
> Bos.
>
> HTH,
> Marc
--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu