[Bioperl-l] Bio::DB::GenBank question (acc vs. version)

bill at genenformics.com
Sun Sep 13 15:47:57 UTC 2009


I would like to make a few comments about get_Seq_by_version and
get_Seq_by_acc. Although both functions use the same NCBI eUtils API, the
request is handled differently on the server side depending on whether
the Seq_id carries a version.

1. If the Seq_id has a version, the GenBank ID server will locate the
corresponding GI and emit the correct sequence.
2. If the Seq_id does not have a version, GBDataLoader will try to find
the latest version number for that Seq_id, which is relatively slow, and
the version number the ID server finds may NOT always be the latest.

IMHO, for both efficiency and consistency,
get_Seq_by_gi > get_Seq_by_version >> get_Seq_by_acc
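
A rough sketch of how a caller could follow that preference order. This
is not code from BioPerl: the helper names pick_fetch_method and
fetch_seq are hypothetical, and the regexes are only my assumption about
what a GI number and a versioned accession look like. The three
get_Seq_by_* methods are the ones discussed in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): pick the Bio::DB::GenBank
# method name from the shape of the identifier alone.
sub pick_fetch_method {
    my ($id) = @_;

    # Assumption: an all-digit identifier is a GI number.
    return 'get_Seq_by_gi' if $id =~ /^\d+$/;

    # Assumption: ACCESSION.VERSION, e.g. J00522.1
    return 'get_Seq_by_version' if $id =~ /^\w+\.\d+$/;

    # Anything else is treated as a bare accession, e.g. J00522
    return 'get_Seq_by_acc';
}

# Hypothetical wrapper: $db is assumed to be a Bio::DB::GenBank object
# (or anything else that provides the three get_Seq_by_* methods).
sub fetch_seq {
    my ($db, $id) = @_;
    my $method = pick_fetch_method($id);
    return $db->$method($id);
}
```

Usage would be something like fetch_seq(Bio::DB::GenBank->new, 'J00522.1'),
which needs BioPerl installed and network access. A bare accession still
goes through get_Seq_by_acc, so the caveat in point 2 applies to it.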

Bill


>
> It looks like Bio::DB::GenBank::get_Seq_by_{version,acc} are
> functionally identical.  They seem to trickle down to the same place
> and walking through these two requests yields almost identical http
> requests:
>
>   $db->get_Seq_by_version('J00522.1')
>   GET
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522.1&usehistory=n
>
>   $db->get_Seq_by_acc('J00522')
>   GET
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522&usehistory=n
>
> The only difference that I can see is that they index into different
> sections of %PARAMSTRING defined in Bio::DB::GenBank, but those
> sections contain the same information.
>
> I'd like a general purpose tool that does The Right Thing whether
> there's a .1 on the end of an identifier or not, and am just trying to
> make sure I'm not doing something troublesome.
>
> Am I correct about the above?
>
> While I'm at it, I think that the comment
>
>   # note that get_Stream_by_version is not implemented
>
> in Bio::DB::GenBank was made obsolete by whoever commented out the
>
>   $self->throw(...)
>
> in get_Stream_by_version in Bio::WebDBSeqI.pm.
>
> I'll happily commit the trivial doc fix if no one shoots down the
> idea. (can't help big, might as well help small...).
>
> Thanks,
>
> g.