Bioperl: Bio::NCBI

Lewis Geer lewisg@ncbi.nlm.nih.gov
Thu, 7 May 1998 14:38:26 -0400 (EDT)


Thanks, guys, for the show of support!  Roland Walker (who has been doing
most of the Perl code in the wrappers) and I want to get out an alpha as
soon as possible.  What we have right now is a parser that turns an ASN.1
file into a hash of hashes.  An example of a sequence entry fragment:
 
    'id' => {
      'gi' => 123456,
      'swissprot' => {
        'accession' => 'P28689',
        'name' => 'HOLE_ECOLI'
      }
    },
    'inst' => {
      'hist' => {
        'replaced-by' => {
          'ids' => {
            'gi' => 399924
          },
          'date_std' => {
            'year' => 1993,
            'day' => 14,
            'month' => 9
          }
        }
      },
      'mol' => 3,
      'repr' => 2,
      'length' => 76,
      'seq-data_iupacaa' =>
        'MLKNLAKLDQTEMDKVNVDLAAAGVAFKERYNMPVIAEAVEREQPEHLRSWFRERLIAHRLASVNLARLPYEPKLK'
    }
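
For illustration, here is a minimal sketch of how a hash of hashes like the
fragment above might be read from Perl.  Only the keys and values come from
the example; the variable $entry and the helper flatten() are hypothetical
names introduced here, not part of the wrappers.

```perl
use strict;
use warnings;

# Part of the fragment shown above, held in a hash reference.
# $entry is a hypothetical variable name used only for this sketch.
my $entry = {
    'id' => {
        'gi'        => 123456,
        'swissprot' => { 'accession' => 'P28689', 'name' => 'HOLE_ECOLI' },
    },
    'inst' => {
        'length'           => 76,
        'seq-data_iupacaa' => 'MLKNLAKLDQTEMDKVNVDLAAAGVAFKERYNMPVIAEAVEREQPEHLRSW'
                            . 'FRERLIAHRLASVNLARLPYEPKLK',
    },
};

# Hypothetical helper: walk nested hashes recursively and return one
# "path = value" string per leaf, with keys visited in sorted order.
sub flatten {
    my ($node, $prefix) = @_;
    my @lines;
    for my $key (sort keys %$node) {
        my $path = defined $prefix ? "$prefix/$key" : $key;
        if (ref $node->{$key} eq 'HASH') {
            push @lines, flatten($node->{$key}, $path);
        } else {
            push @lines, "$path = $node->{$key}";
        }
    }
    return @lines;
}

# Direct access to nested fields.  Keys containing a hyphen must be quoted.
my $accession = $entry->{id}{swissprot}{accession};
my $seq       = $entry->{inst}{'seq-data_iupacaa'};

print "$accession: ", length($seq), " residues\n";
print "$_\n" for flatten($entry);
```

The stored 'length' field can be checked against length($seq) as a quick
sanity test that the sequence string survived intact.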

We are also working on a BLAST 2 Sequences function.

Comments on your suggestions:

- merging with the BioPerl framework:  first we need to expose the guts of
the toolkit to Perl, which is a bunch of work.  After that we can
understand how merging could work, but for now keeping them separate is
probably easier.

- NCBI will probably move most of its query capabilities to pure HTTP over
the next year, so I'm not sure what the library functions would add for
Perl beyond what straight HTTP queries already give you.  The process would be:

Perl HTTP query --> NCBI --> ASN.1 --> hash of hashes

It does mean that the HTTP queries need to be better standardized and that
the returned information needs to be in a standard format.  There is
definitely some work to be done on that, especially with BLAST -- at a
minimum BLAST ought to return the ASN.1 it uses internally to represent
alignments.

- Ewan's comment on SWIG: yep, I had the same problems you did, so I opted
to use XS.  Cool program, though.
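
As a rough sketch of the last step of the pipeline mentioned in the comments
above (ASN.1 text --> hash of hashes), the toy parser below handles only a
simplified subset: comma-separated "name value" pairs where a value is a
number, a quoted string, a bare word, or a nested { ... } block.  It is not
the toolkit parser and does not cover real ASN.1 (for example, unnamed
SEQUENCE OF elements).  The names parse_block, parse_asn1_text and $reply are
assumptions made for this example, and the HTTP request itself is left out
because no query URL is given here; $reply stands in for the returned text.

```perl
use strict;
use warnings;

# Parse "name value, name value, ..." starting at the current match
# position of the referenced string, and return a hash reference.
sub parse_block {
    my ($text_ref) = @_;
    my %hash;
    while (1) {
        $$text_ref =~ /\G\s*([A-Za-z][\w-]*)\s*/gc
            or die "expected a field name";
        my $name = $1;
        if ($$text_ref =~ /\G\{/gc) {
            # Nested block: recurse, then require the closing brace.
            $hash{$name} = parse_block($text_ref);
            $$text_ref =~ /\G\s*\}/gc or die "expected '}' after $name";
        } elsif ($$text_ref =~ /\G"([^"]*)"/gc) {
            $hash{$name} = $1;                  # quoted string
        } elsif ($$text_ref =~ /\G(-?\d+|[A-Za-z][\w-]*)/gc) {
            $hash{$name} = $1;                  # integer or bare word
        } else {
            die "expected a value for $name";
        }
        # Continue only if another comma-separated pair follows.
        last unless $$text_ref =~ /\G\s*,/gc;
    }
    return \%hash;
}

# Parse a whole string and insist that nothing is left over.
sub parse_asn1_text {
    my ($text) = @_;
    my $parsed = parse_block(\$text);
    $text =~ /\G\s*\z/gc or die "unexpected trailing input";
    return $parsed;
}

# Stand-in for the text an HTTP query might return (hypothetical input).
my $reply = 'id { gi 123456, swissprot { accession "P28689", '
          . 'name "HOLE_ECOLI" } }, inst { mol 3, repr 2, length 76 }';

my $entry = parse_asn1_text($reply);
print "gi $entry->{id}{gi}, name $entry->{id}{swissprot}{name}\n";
```

The \G anchor with the /gc modifiers lets each regex pick up exactly where
the previous one stopped, so no separate tokenizer is needed for a grammar
this small.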

Thanks again,
Lewis


On Thu, 7 May 1998, bpollock wrote:

> 
> Hello-
> 
> On Wed, 6 May 1998, Lincoln Stein wrote:
> 
> > Is anyone working on such a project?  If not, I'd like to propose a
> > new Bio::NCBI namespace for these modules.  Also, there appears to be
> > some disagreement at NCBI on whether such an effort is useful
> > (unredeemable C programmers dominate there, and they have yet to
> > appreciate the beauty of Perl).  
> 
> We have a crude BLAST parser that we have used for a large validation
> effort, but nothing that provides native access.  This kind of module
> would greatly enhance several of our future efforts that we have
> scheduled. 
>  
> > If people think this project is a good
> > idea, it would be helpful to make some enthusiastic noise about it so
> > that Lewis can have something to go to his superiors with when he
> > argues for moving the project up in the priority list.
> 
> I will agree that the BLAST and pubmed retrievals would be a fantastic 
> tool to have.  We are also very excited to hear about this potential. 
> 
> Brian
> 
> ------------------------------------------------------
> Brian K. Pollock
> BioInformatics Group
> Research Genetics Inc.
> (800)533-4363 X2266
> bpollock@resgen.com
> http://www.resgen.com
> -------------------------------------------------------
> 
> 
> =========== Bioperl Project Mailing List Message Footer =======
> Project URL: http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/
> For info about how to (un)subscribe, where messages are archived, etc:
> http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
> ====================================================================
> 

