[Bioperl-l] Fetching Fasta seqs from GenBank - Help request
sanges at biogem.it
Sat Mar 27 08:12:00 EST 2004
Alberto,
you have an error in your code:
my $query_string = ('Bothrops[Organism] AND ribosomal',
                    'Bothrops[Organism] AND mitochondrial');
With this line you are assigning a list to a scalar, so only the last
element is kept (this is also what triggers the "Useless use of a constant
in void context" warning at line 10). Try adding this line
print $query_string;
and you will see that $query_string holds only the last value!
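For illustration, here is a minimal sketch of what Perl does with that kind of assignment. It is not taken from the original script; the array variable @query_strings and the "no warnings" line are only there for the demonstration.

```perl
use strict;
use warnings;

my $query_string;
{
    # Only for this demo: silence "Useless use of a constant in void context".
    no warnings 'void';
    # In scalar context the comma operator discards its left operand
    # and returns the right one, so only the second string is assigned.
    $query_string = ('Bothrops[Organism] AND ribosomal',
                     'Bothrops[Organism] AND mitochondrial');
}

# An array, by contrast, keeps both values.
my @query_strings = ('Bothrops[Organism] AND ribosomal',
                     'Bothrops[Organism] AND mitochondrial');

print "scalar: $query_string\n";
print "array:  $_\n" for @query_strings;
```

The "scalar:" line shows only the mitochondrial query, while the array holds both strings.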
If I understood your need correctly, you should use a query like this:
my $query_string = 'Bothrops[Organism] AND (ribosomal OR mitochondrial)';
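If you later want more than two terms, the same kind of query string can be assembled from a list. The helper below is only a sketch: build_entrez_query is a hypothetical name, not a BioPerl function, and it assumes the terms are plain words that need no quoting.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): restrict the search to one
# organism and OR together any number of free-text terms.
sub build_entrez_query {
    my ($organism, @terms) = @_;
    my $query = sprintf '%s[Organism]', $organism;
    $query .= ' AND (' . join(' OR ', @terms) . ')' if @terms;
    return $query;
}

my $query_string = build_entrez_query('Bothrops', 'ribosomal', 'mitochondrial');
print "$query_string\n";
```

The result is an ordinary string, so it can be passed as the -query argument to Bio::DB::Query::GenBank in the same way as the hand-written one.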
Remo
Quoting Alberto Davila <davila at ioc.fiocruz.br>:
> Hi Sean,
>
> Thanks for your valuable help !
>
> I solved the problem using "Bio::DB::Query::GenBank", my goal was to
> retrieve 2 types of sequences (mitochondrial and ribosomal) from
> specific organism (eg Bothrops spp)... I am listing my script for those
> interested to do something similar.. the only warning I get is:
>
> [davila at tryps script]$ perl fetch2contaminant.pl
> Useless use of a constant in void context at fetch2contaminant.pl line 10.
>
> I was not sure in which field (eg keyword or feature) I should look for
> ribosomal and mitochondrial genes, but leaving blank gave some good
> results.
>
> Indeed Bioperl is powerful... a bit confusing for beginners too.
>
> Thanks and best regards,
>
> Alberto
>
>
> #!/usr/local/bin/perl -w
>
> use lib "/usr/local/bioperl14";
> use strict;
> use Bio::DB::Query::GenBank;
> use Bio::SeqIO;
> use Bio::DB::GenBank;
>
>
> my $query_string = ('Bothrops[Organism] AND ribosomal',
> 'Bothrops[Organism] AND mitochondrial');
> my $query = new Bio::DB::Query::GenBank(-db=>'nucleotide',
> -query=>$query_string,
> -mindate => '1985',
> -maxdate => '2004');
>
> my $seqio=new Bio::DB::GenBank->get_Stream_by_query($query);
>
> #open a seqio handle for writing the outputfile in fasta
> my $outfile = new Bio::SeqIO(-format=>'fasta',
> -file=>'>contaminant.bothrops');
>
> while (my $s = $seqio->next_seq) {
>
> #write the fasta
> $outfile->write_seq($s);
>
> }
>
>
> exit;
>
>
>
>
>
>
>
> On Thu, 2004-03-25 at 16:37, Sean Davis wrote:
> > Alberto,
> >
> > I would second that. If you are doing more with this than retrieving raw
> > sequence (if you care at all), maybe you could let Barry and me know what
> > you are trying to do more generally. Bioperl is quite powerful, but it does
> > take some direction to get started.
> >
> > Sean
> >
> > On 3/25/04 12:43 PM, "Barry Moore" <barry.moore at genetics.utah.edu> wrote:
> >
> > > Alberto-
> > >
> > > You said, "the 'get_Stream_by_id' is returning me more than the
> > > 'sequence per se'". I'm not sure if this is what you're asking, but I'll
> > > take a shot. Since you are retrieving your two sequences in EMBL
> > > format, you get all the associated information that you would see if you
> > > downloaded that same file from the web interface. Your sequences are
> > > stored by BioPerl as RichSeq objects, each of which holds an associated
> > > PrimarySeq object. So the EMBL file data is stored in the RichSeq object
> > > and its associated PrimarySeq object. Of course, when you save
> > > that locally as a fasta file, that extra information is lost. If you
> > > decide you need to use that data, have a look at the documentation for
> > > Bio::Seq::RichSeq and Bio::PrimarySeq and the SeqIO and Feature
> > > Annotation HOW TOs to learn more.
> > >
> > > Barry
> > >
> > > Alberto Davila wrote:
> > >
> > >> Thanks Jason,
> > >>
> > >> I installed the IO::String, then it is working fine now. However I have
> > >> a doubt, the "get_Stream_by_id" is returning me more than the "sequence
> > >> per se", what is it ? My script and results are listed below. Finally I
> > >> would like to save (in my local disk) the retrieved sequences as fasta
> > >> files... is there any argument for that ?
> > >>
> > >> Thanks again, Alberto
> > >>
> > >>
> > >> #!/usr/local/bin/perl -w
> > >>
> > >> use lib "/usr/local/bioperl14";
> > >> use Bio::DB::BioFetch;
> > >> use strict;
> > >> use Bio::DB::WebDBSeqI;
> > >> use HTTP::Request::Common 'POST';
> > >>
> > >> my $format_type='fasta';
> > >> my $stream;
> > >>
> > >>
> > >> my $bf = new Bio::DB::BioFetch(-format =>$format_type,
> > >> -retrievaltype =>'tempfile',
> > >> -db =>'EMBL');
> > >>
> > >> $stream = $bf->get_Stream_by_id(['BUM','J00231']);
> > >> while (my $s = $stream->next_seq) {
> > >> print $s->seq,"\n\n\n";
> > >> }
> > >>
> > >>
> > >> exit;
> > >>
> > >>
> > >>
> > >>
> > >> [davila at tryps script]$ perl gb-fetch-1.pl
> > >>
> > >> agtagtgtactaccaagtatagataacgtttaaatattaaagttttggatcaaagccaaagatgattcgcat
> > >> gctggtgctgattgtagttacagctgcaagcccagtgtatcagagatgtttccaagatggggctatagtgaa
> > >> gcaaaacccatccaaagaggcagtcacagaagtgtccctaaaagatgatgttagca
> > >>
> > >>
> > >> cctggacctcctgtgcaagaacatgaaacanctgtggttcttccttctcctggtggcagctcccagatgggt
> > >> cctgtcccaggtgcacctgcaggagtcgggcccaggactggggaagcctccagagctcaaaaccccacttgg
> > >> tgacacaactcacacatgcccacggtgcccagagcccaaatcttgtgacacacctcccccgtgcccacggtg
> > >> cccagagcccaaatcttgtgacacacctcccccatgcccacggtgcccagagcccaaatcttgtgacacacc
> > >> tcccccgtgcccnnngtgcccagcacctgaactcttgggaggaccgtcagtcttcctcttccccccaaaacc
> > >> caaggatacccttatgatttcccggacccctgaggtcacgtgcgtggtggtggacgtgagccacgaagaccc
> > >> nnnngtccagttcaagtggtacgtggacggcgtggaggtgcataatgccaagacaaagctgcgggaggagca
> > >> gtacaacagcacgttccgtgtggtcagcgtcctcaccgtcctgcaccaggactggctgaacggcaaggagta
> > >> caagtgcaaggtctccaacaaagccctcccagcccccatcgagaaaaccatctccaaagccaaaggacagcc
> > >> cnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnngaggagatgaccaagaaccaagtcagcctgacctg
> > >> cctggtcaaaggcttctaccccagcgacatcgccgtggagtgggagagcaatgggcagccggagaacaacta
> > >> caacaccacgcctcccatgctggactccgacggctccttcttcctctacagcaagctcaccgtggacaagag
> > >> caggtggcagcaggggaacatcttctcatgctccgtgatgcatgaggctctgcacaaccgctacacgcagaa
> > >> gagcctctccctgtctccgggtaaatgagtgccatggccggcaagcccccgctccccgggctctcggggtcg
> > >> cgcgaggatgcttggcacgtaccccgtgtacatacttcccaggcacccagcatggaaataaagcacccagcg
> > >> ctgccctgg
> > >>
> > >>
> > >>
> > >>
> > >> On Tue, 2004-03-23 at 22:44, Jason Stajich wrote:
> > >>
> > >>
> > >>> You need an additional perl module.
> > >>>
> > >>>
> > >>> install IO::String from CPAN
> > >>>
> > >>> There is a section on how to install additional perl modules in the
> > >>> INSTALL document.
> > >>>
> > >>> -j
> > >>>
> > >>> On Tue, 23 Mar 2004, Alberto Davila wrote:
> > >>>
> > >>>
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>> May I ask for some help ?
> > >>>>
> > >>>> I am trying to use the BioFetch module in order to download several seqs
> > >>>> (from specific Organisms) from GenBank in fasta format, but looks like I
> > >>>> am missing "IO/String.pm" and other things.. should I install additional
> > >>>> bioperl modules (I have the Bioperl Core 1.4 installed) ? or use a
> > >>>> different module for my purpose ?
> > >>>>
> > >>>> My script and error msg are listed below.
> > >>>>
> > >>>> Thanks and best regards,
> > >>>>
> > >>>> Alberto
> > >>>>
> > >>>> ****
> > >>>>
> > >>>> #!/usr/local/bin/perl -w
> > >>>>
> > >>>> use lib "/usr/local/bioperl14";
> > >>>> package Bio::DB::BioFetch;
> > >>>> use strict;
> > >>>> use Bio::DB::WebDBSeqI;
> > >>>> use HTTP::Request::Common 'POST';
> > >>>>
> > >>>> my $format_type='fasta';
> > >>>> my $stream;
> > >>>>
> > >>>>
> > >>>> my $bf = new Bio::DB::BioFetch(-format =>$format_type,
> > >>>> -retrievaltype =>'tempfile',
> > >>>> -db =>'EMBL');
> > >>>>
> > >>>> $stream = $bf->get_Stream_by_id(['BUM','J00231']);
> > >>>> while (my $s = $stream->next_seq) {
> > >>>> print $s->seq,"\n";
> > >>>> }
> > >>>>
> > >>>>
> > >>>> exit;
> > >>>>
> > >>>>
> > >>>> [davila at tryps script]$ perl gb-fetch-1.pl
> > >>>> Can't locate IO/String.pm in @INC (@INC contains:
> > >>>> /usr/local/bioperl14/i386-linux-thread-multi /usr/local/bioperl14
> > >>>> /usr/lib/perl5/5.8.3/i386-linux-thread-multi /usr/lib/perl5/5.8.3
> > >>>> /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
> > >>>> /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
> > >>>> /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
> > >>>> /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
> > >>>> /usr/lib/perl5/site_perl/5.8.3 /usr/lib/perl5/site_perl/5.8.2
> > >>>> /usr/lib/perl5/site_perl/5.8.1 /usr/lib/perl5/site_perl/5.8.0
> > >>>> /usr/lib/perl5/site_perl
> > >>>> /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
> > >>>> /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
> > >>>> /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
> > >>>> /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
> > >>>> /usr/lib/perl5/vendor_perl/5.8.3 /usr/lib/perl5/vendor_perl/5.8.2
> > >>>> /usr/lib/perl5/vendor_perl/5.8.1 /usr/lib/perl5/vendor_perl/5.8.0
> > >>>> /usr/lib/perl5/vendor_perl .) at
> > >>>> /usr/local/bioperl14/Bio/DB/WebDBSeqI.pm line 90.
> > >>>> BEGIN failed--compilation aborted at
> > >>>> /usr/local/bioperl14/Bio/DB/WebDBSeqI.pm line 90.
> > >>>> Compilation failed in require at gb-fetch-1.pl line 6.
> > >>>> BEGIN failed--compilation aborted at gb-fetch-1.pl line 6.
>
>