[Bioperl-l] Extracting gi no from refseq record

Jason Stajich jason at cgt.mc.duke.edu
Thu Apr 3 10:20:04 EST 2003


It should have been in $seq->primary_id() - but we only pick up the GI from
the VERSION field. I assume the file in question has a line like the one
below?

I notice we don't try to parse NID lines from GenBank.

VERSION     AI129902.1  GI:3598416
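
As a stopgap that does not depend on what the parser stores, the GI can be
read directly off a VERSION line like the one above. This is only a minimal
sketch: gi_from_version_line is a hypothetical helper name, not a BioPerl
function, and it assumes the VERSION line has the "accession.version  GI:number"
layout shown here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): pull the GI number out of a
# GenBank VERSION line such as "VERSION     AI129902.1  GI:3598416".
# Returns undef when the line carries no GI.
sub gi_from_version_line {
    my ($line) = @_;
    return $line =~ /^VERSION\s+\S+\s+GI:(\d+)/ ? $1 : undef;
}

my $line = 'VERSION     AI129902.1  GI:3598416';
my $gi   = gi_from_version_line($line);
print defined $gi ? "GI: $gi\n" : "no GI on this line\n";
```

Run against the example line, this prints "GI: 3598416". A VERSION line
without a GI field yields undef rather than the locus name.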

<rant>
I will, once again, put out a call for someone to write a set of
comprehensive tests for the sequence parsers, to check that we properly
parse all the fields into the appropriate Bio::Seq/Bio::Seq::RichSeq
objects from the rich genbank, embl, and swissprot formats.  (This is the
part where all the people who want to join in on the project, but aren't
sure where to help, volunteer.)  Just write a test in the style of
t/SeqIO.t but include more richly annotated files, files with TPA
annotation, and test that all the information is properly available in the
parser-created Seq object.

I would advocate that sequence parsing get a nice code review during the
1.3 developer series this summer.  This may or may not be the time for
event-based parsing to be implemented too - I hope we will be able to
begin that implementation as well.
</rant>

-jason

On Thu, 3 Apr 2003, Siddhartha Basu wrote:

> Hi,
> I am trying to extract the gi no from the refseq flat files that is in
> genbank format. This is what i have done so far...
>
> *** Indexed the files with Bio::Index::GenBank module
> *** Then try to fetch a particular entry by get_Seq_by_acc/id call.
> *** It returns a Bio::Seq::RichSeq object.
> *** Now i have tried to get the gi no by
> *** $Seq->primary_id(), $Seq->display_id() calls but both of them return
> the locus name. Since $Seq is a RichSeq object i have also tried with
> $Seq->pid() call but nothing happens.
>
>
> So, how is it possible to extract that number for a particular NP or NM
> refseq number specially from flat files.
>
>
> siddhartha

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
