[Bioperl-l] use Bio::SeqIO to read Fasta sequence from pipe, or @ARGV, like "while (<>) {....}"
Haiyan Lin
linhy0120 at gmail.com
Sun Jul 13 01:56:52 UTC 2014
Thanks to Mark and Torsten for their advice.
I can got what I want using both "less" and "cat" by writing data form
upstream pipeline into a temporary file, followed by processing and
removing. But, it is not efficient. I'm looking for a better method.
Whether can I avoiding writting and removinf the file? Thanks for any
advic
##### Following is MyData
[linhy at bioinfo1 Script]$ more try.fa
>Contig000001
CCACGTAAGAGCACCTGGGTCCCCGCCCGCCAAGCGCCGCGAGCGCCAGCAGCAGCTCGC
>hello
ATATATTTTT
##### Following is my code
[linhy at bioinfo1 Script]$ more fastaLen.pl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;
use Bio::SeqIO ;
use Getopt::Long;
use Statistics::Descriptive ;
my %opt = () ;
GetOptions(
"sum"=>\$opt{sum},
);
my $tf = "/tmp/subFasta.len.PID$$.fa";
open WTF,">$tf" or die $!;
while(<>){
print WTF ;
}
close WTF;
my $sta = Statistics::Descriptive::Full->new();
my $in = Bio::SeqIO->new(-format=>"Fasta",-file=>$tf);
#### I'm failed with the following commented line
#my $in = Bio::SeqIO->new(-format=>"Fasta",-fh=>\*DATA);
while(my $s = $in->next_seq()){
$sta->add_data($s->length()) ;
}
`rm $tf` ;
if ( $opt{sum} ) {
print $sta->sum(), "\n" ;
}
exit;
##### The output by above code and data using by both "cat" and "less"
[linhy at bioinfo1 Script]$ cat try.fa | ./fastaLen.pl -s
70
[linhy at bioinfo1 Script]$ less try.fa | ./fastaLen.pl -s
70
On Sat, 2014-07-12 at 20:57 -0400, Mark A Jensen wrote:
> You should do
> cat try.fa | perl ..
> rather than less,IMO. less formats things.
>
> You can add
> no warnings qw/once/;
> to get rid of that warning, it shouldn't affect
> the program.
>
> MAJ
>
>
> On Sat, Jul 12, 2014 at 8:03 PM, Haiyan Lin <linhy0120 at gmail.com>
> wrote:
>
>
> Hi, Paul,
>
> Thanks for your quick replay and advice. Soryy for my late
> feedback, I can't send maill in my location yesterday. I have
> tried your method. But
> it complains that
>
> ------------------------------------
> [linhy at bioinfo1 Script]$ less try.fa | perl ./fastaLen.pl
> Name "main::DATA" used only once: possible typo
> at ./fastaLen.pl line
> 42.
> Use of uninitialized value in print at ./fastaLen.pl line 49.
> ------------------------------------
>
>
> I had also tried the "-fh=>\*STDIN", and "-fh=><>",according
> to the
> documentation returned by "perldoc Bio::SeqIO" at termminal,
> but I still
> haven't got I want.Thanks again.
>
> ... ...
>
> Bio::SeqIO->new()
> $seqIO = Bio::SeqIO->new(-file => 'filename',
> -format=>$format);
> $seqIO = Bio::SeqIO->new(-fh => \*FILEHANDLE,
> -format=>$format);
> $seqIO = Bio::SeqIO->new(-format => $format);
>
> ... ...
>
> -fh You may provide new() with a previously-opened
> filehandle.
> For example, to read from STDIN:
>
> $seqIO = Bio::SeqIO->new(-fh => \*STDIN);
>
> Note that you must pass filehandles as references
> to globs.
>
> If neither a filehandle nor a filename is
> specified, then
> the module will read from the @ARGV array or STDIN, using the
> familiar <> semantics.
>
> -------------------------------
>
>
>
> Regards
>
> Haiyan
>
>
>
>
> On Sat, 2014-07-12 at 10:22 -0400, Paul Cantalupo wrote:
> > Hi Haiyan,
> >
> >
> > You need to use the '-fh' option in Bio::SeqIO new and have
> it use
> > Perl's DATA filehandle like so:
> > my $in = Bio::SeqIO->new(-fh => \*DATA, -format => 'fasta');
> >
> > This was taken from
> > http://perldoc.perl.org/perldata.html#Special-Literals:
> > "Text after __DATA__ may be read via the filehandle
> PACKNAME::DATA ,
> > where PACKNAME is the package that was current when the
> __DATA__ token
> > was encountered."
> >
> >
> >
> > Good luck,
> >
> > Paul
> >
> >
> >
> >
> > Paul Cantalupo
> > University of Pittsburgh
> >
> >
> >
> > On Sat, Jul 12, 2014 at 8:35 AM, Haiyan Lin
> <linhy0120 at gmail.com>
> > wrote:
> > Hill, dear perlers,
> >
> > I‘m trying to use Bio::SeqIO to read Fasta sequence
> from pipe,
> > or @ARGV,
> > like "while (<>) {....}". After several trier and
> error, I'm
> > failed and
> > need to ask for herp from you. Could you please help
> me to
> > check or try
> > following code?
> >
> > Thanks in advance.
> >
> > ---------------------------------------
> > use Bio::SeqIO ;
> > use Statistics::Descriptive ;
> >
> > my %opt = () ;
> > my $sta = Statistics::Descriptive::Full->new();
> >
> > ##### here is the key, I think.
> > my $in = Bio::SeqIO->new(-format=>"Fasta");
> > while(my $s = $in->next_seq()){
> > $sta->add_data($s->length()) ;
> > }
> > print $sta->sum() if $opt{sum} ;
> >
> > __DATA__
> > >ct1
> > AGAGAGAGA
> > >ctg2
> > ATATATAT
> > -----------------------------------------------
> >
> > Regards
> >
> > Haiyan
> >
> >
> >
> >
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at mailman.open-bio.org
> >
> http://mailman.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at mailman.open-bio.org
> http://mailman.open-bio.org/mailman/listinfo/bioperl-l
>
More information about the Bioperl-l
mailing list