[Bioperl-l] script
James Wasmuth
james.wasmuth at ed.ac.uk
Thu Sep 4 12:45:05 EDT 2003
Sorry to be anally retentive, but just in case it is on a Windows
machine and the FASTA file has funny carriage-return formatting which
may not be picked up with '\n', use:
open IN, "<file.fa" or die;
while (<IN>) { $lines .= $_; }
# /m makes ^ match at the start of every line, not only the start of the string
$count++ while ($lines=~m/^>/mg);
print "No. of seqs: ", $count, "\n";
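One caveat on line endings: Perl's ^ anchor, even with the /m flag, only treats the position after a \n as a line start, so a file with bare \r line endings (old Mac style) would still be undercounted. Below is a minimal sketch, assuming the whole file fits in memory, that counts a '>' at the very start of the data or directly after any CR or LF. The sub name count_fasta_headers and the command-line file argument are illustrative only, not something from this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count FASTA records in a string, whatever the line endings
# (\n, \r\n or bare \r). A header is taken to be a '>' at the very
# start of the data or directly after a CR or LF character.
sub count_fasta_headers {
    my ($data) = @_;
    # List assignment in scalar context yields the number of matches.
    my $count = () = $data =~ /(?:\A|[\r\n])>/g;
    return $count;
}

# Hypothetical usage: the file name is taken from the command line.
if (@ARGV) {
    my $file = shift @ARGV;
    # :raw stops Perl from translating line endings on Windows.
    open(my $in, '<:raw', $file) or die "Cannot open $file: $!";
    local $/;                          # slurp mode: read the whole file
    my $data = <$in>;
    $data = '' unless defined $data;   # guard against an empty read
    close($in);
    print "No. of seqs: ", count_fasta_headers($data), "\n";
}
```

Saved as, say, count_fasta.pl (a made-up name), it would be run as "perl count_fasta.pl file.fa". A '>' in the middle of a header line is not counted, which the plain grep ">" approach would do.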
I think all bases have been covered now, except loading it into Bio::SeqIO...
:-p
james
Andreas Kahari wrote:
>If it's not a Unix system, this [untested] Perl snippet will do
>approximately the same thing:
>
>$/ = "\n>";
>$count = 0;
>
>open(IN, "file.fa") or die;
>while (<IN>) { $count++ }
>close(IN);
>
>print "No. of seqs: ", $count, "\n";
>
>
>On Thu, Sep 04, 2003 at 05:12:21PM +0100, James Wasmuth wrote:
>
>
>>If it's a standard FASTA format file, then at the command line prompt type:
>>
>>grep ">" file.fa | wc -l
>>
>>hth
>>james
>>
>>Lobvi Matamoros wrote:
>>
>>
>>
>>>Hi:
>>>
>>>Does anyone have a script to count how many proteins there are in
>>>a database/file in FASTA format?
>>>
>>>Thanks in advance for your help
>>>
>>>
>[cut]
>
>
>
--
Nematode Bioinformatics
Blaxter Nematode Genomics Group
Institute of Cell, Animal and Population Biology
Ashworth Labs
University of Edinburgh
King's Buildings
Edinburgh
EH9 3JT
UK
(+44)(0)131 650 7403