[Bioperl-l] Genbank Bioperl problem

Ali Al-Shahib alshahib@dcs.gla.ac.uk
Thu, 13 Jun 2002 16:00:28 +0100 (BST)


Hi Brian

Bio::DB::GenPept worked for me, but I had to use 'id' instead of 'acc', so:
my $seq = $gb->get_Seq_by_id('NP_457465.1');
I also tried Bio::DB::RefSeq, but it doesn't work for me.

However, I have now run into another problem.  I wanted to use the batch
method (my $seq = $gb->get_Seq_by_batch($filename)), but GenPept doesn't
support this.  Do you have any ideas on how I can solve this?  I have a lot
of NP_ accessions that I need to fetch from NCBI, and it's not feasible for
me to do them without batch retrieval.
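
The only fallback I can think of is sketched below.  It reads one accession
per line from a file and fetches them in small groups.  It assumes that
get_Stream_by_id accepts an array reference of IDs for Bio::DB::GenPept in
the same way as for Bio::DB::GenBank, which I have not confirmed.  The names
read_ids, chunk and fetch_organisms are my own, and the group size of 50 and
the 3-second pause are arbitrary choices.

```perl
#!/usr/local/bin/perl -w
use strict;

# Read one accession per line, skipping blank lines and '#' comments.
sub read_ids {
    my ($filename) = @_;
    open(my $fh, '<', $filename) or die "Cannot open $filename: $!";
    my @ids;
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;
        next if $line eq '' or $line =~ /^#/;
        push @ids, $line;
    }
    close($fh);
    return @ids;
}

# Split a list into groups of at most $size elements.
sub chunk {
    my ($size, @items) = @_;
    $size = 1 if $size < 1;    # guard against an endless loop
    my @chunks;
    push @chunks, [ splice(@items, 0, $size) ] while @items;
    return @chunks;
}

# Fetch each group as a sequence stream and print accession and organism.
sub fetch_organisms {
    my ($size, @ids) = @_;
    # Loaded here so the helpers above can be used without BioPerl.
    require Bio::DB::GenPept;
    my $gb = Bio::DB::GenPept->new;
    foreach my $group (chunk($size, @ids)) {
        # ASSUMPTION: GenPept accepts an array ref of IDs here.
        my $stream = $gb->get_Stream_by_id($group);
        while (my $seq = $stream->next_seq) {
            my $sp = $seq->species;
            print $seq->accession_number, "\t",
                  ($sp ? $sp->binomial : 'unknown'), "\n";
        }
        sleep 3;    # pause between requests (arbitrary value)
    }
}

# Only contacts NCBI when a filename is given on the command line.
fetch_organisms(50, read_ids($ARGV[0])) if @ARGV;
```

I would run it as "perl fetch_np.pl np_ids.txt", where both file names are
just placeholders.  If get_Stream_by_id does not work this way for GenPept,
the same loop could call get_Seq_by_id once per accession instead.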

Thank you in advance.

Ali 

On Thu, 13 Jun 2002, Brian Osborne wrote:

> Ali and Stefan,
> 
> Accession numbers starting with NP_ are Genbank RefSeq entries (see
> http://www.ncbi.nlm.nih.gov/LocusLink/RSfaq.html). From the Bioperl FAQ:
> 
>   Q2.3: How can I get NT_ or NM_ accessions from NCBI (Reference
> 	Sequences)?
> 
> 	Use Bio::DB::RefSeq not Bio::DB::GenBank when you are retrieving
> 	the NM_ accessions. This is still an area of active development
> 	because the data providers have not provided the best interface for
> 	us to query.  EBI has provided a mirror with their dbfetch system,
> 	which is accessible through the Bio::DB::RefSeq object; however,
> 	there are cases where NT_ accessions will not be retrievable.
> 
> Bio::DB::GenPept won't work, and a one-liner using Bio::DB::RefSeq seemed to
> work. I'll change the FAQ so that it refers to NP_'s as well.
> 
> Brian O.
> 
> -----Original Message-----
> From: bioperl-l-admin@bioperl.org [mailto:bioperl-l-admin@bioperl.org]On
> Behalf Of Stefan A Kirov
> Sent: Wednesday, June 12, 2002 2:59 PM
> To: Ali Al-Shahib
> Cc: Bioperl
> Subject: Re: [Bioperl-l] Genbank Bioperl help
> 
> Use Bio::DB::GenPept for proteins!
> Good luck!
> Stefan
> 
> On Wed, 12 Jun 2002, Ali Al-Shahib wrote:
> 
> >
> >Hi
> >
> >I've got a set of accession numbers but they start with 'NP_' as they are
> >proteins.  I've used the genbank module (Bio::DB::GenBank) and produced
> >the following script:
> >
> >#!/usr/local/bin/perl -w
> >
> >use Bio::DB::GenBank;
> >use Bio::Species;
> >my $gb = new Bio::DB::GenBank;
> >
> >#get a particular accession number
> >my $seq = $gb->get_Seq_by_acc('NP_347647');
> >
> >#get the species from the 'sequence' object
> >my $sp = $seq->species();
> >
> >#get the classification
> >my @class = $sp->classification();
> >
> >#print out the result, line by line
> >print join ("\n", @class), "\n";
> >
> >However, it works for accession numbers of nucleotide sequences but not
> >for those of protein sequences.  How can I change the script to make it
> >fetch the organism name from GenBank using a protein accession number
> >that starts with 'NP_' (example: NP_347647.1)?  It fetches accession
> >numbers like AC021953, but not 'NP_.....'.
> >
> >I would greatly appreciate it if you can answer my query.
> >
> >Thank you in advance
> >
> >Ali
> >--
> >Mr Ali Al-Shahib
> >Research Student
> >Bioinformatics Research Centre
> >Department of Computing Science
> >17 Lilybank Gardens
> >University of Glasgow
> >Glasgow G12 8QQ
> >Scotland, UK
> >Tel: 0141 330 2421 (direct)
> >E-mail: alshahib@dcs.gla.ac.uk
> >Web page: http://www.dcs.gla.ac.uk/~alshahib
> >
> >_______________________________________________
> >Bioperl-l mailing list
> >Bioperl-l@bioperl.org
> >http://bioperl.org/mailman/listinfo/bioperl-l
> >

-- 
Mr Ali Al-Shahib
Research Student
Bioinformatics Research Centre
Department of Computing Science
17 Lilybank Gardens
University of Glasgow
Glasgow G12 8QQ
Scotland, UK
Tel: 0141 330 2421 (direct) 
E-mail: alshahib@dcs.gla.ac.uk
Web page: http://www.dcs.gla.ac.uk/~alshahib