[Bioperl-l] Question: How to manipulate files

Marco Blanchette mblanche at berkeley.edu
Thu Mar 30 01:20:22 UTC 2006


Michael--

Something like:

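#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;

# FASTA file to filter, given as the first command-line argument
my $file = shift or die "Usage: $0 input.fasta\n";

my $seqio_o = Bio::SeqIO->new(-file => $file, -format => 'fasta');

while (my $seq_o = $seqio_o->next_seq) {
    # Pull the trailing number out of the ID, e.g. "..._001" gives 1
    my ($id) = $seq_o->display_id =~ /_(\d+)$/;
    next unless defined $id;
    # Keep 008 and up; sequences 001 to 007 are dropped
    print ">", $seq_o->display_id, "\n", $seq_o->seq, "\n" if $id > 7;
}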

If you redirect standard output to a new file, this script should do what you
are trying to achieve.

Just call:
$ perl theScript.pl myfile.fasta > myNewFile.fasta
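
If BioPerl is not installed, the same filter can be sketched in core Perl.
This is only a rough sketch: it assumes each header line ends in "_" plus the
sequence number, as in the sample quoted below, and the names filter_fasta and
$last_dropped are made up for illustration. It copies sequence lines verbatim,
so the line wrapping of the original file is kept.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy FASTA records from $in to $out, keeping only those whose header
# ends in "_<number>" with number greater than $last_dropped.
# (filter_fasta and $last_dropped are illustrative names, not BioPerl API.)
sub filter_fasta {
    my ($in, $out, $last_dropped) = @_;
    my $keep = 0;    # true while we are inside a record we want to keep
    while (my $line = <$in>) {
        if ($line =~ /^>/) {
            # Header line: decide whether this record is kept
            my ($num) = $line =~ /_(\d+)\s*$/;
            $keep = defined($num) && $num > $last_dropped;
        }
        print {$out} $line if $keep;
    }
    return;
}

# Run as a script only when a file name is given on the command line
if (@ARGV) {
    open my $in, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!\n";
    filter_fasta($in, \*STDOUT, 7);    # drop 001 to 007, keep the rest
    close $in;
}
```

Saved under some name of your choosing, it is called the same way, with
standard output redirected to the new file.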


On 3/29/06 14:41, "Michael Craige" <mcraige at genetics.emory.edu> wrote:

> I am attempting to develop a script to open a DNA file containing 15 FASTA
> sequences and then delete the first 7 sequences and close the file, leaving
> the remaining 8 sequences intact.
> 
> Can someone help me with a Perl script or point me to some doc that can
> help? Here is a sample; the header of the first sequence in the file is shown
> below. All the headers are the same except for the number, "001" to "015".
> 
> 
> >10kb_NN_Analysis.txt.nmrc_001
> NTNTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTNNNNNNNN
> AANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
> NNNNNNNNNNNNNNNNNNNNNNNN
> 
> I am trying to get the script to find the first sequence, ".nmrc_001", and
> then delete the file's content through the end of ".nmrc_007" without
> affecting the header with ".nmrc_008".
> 
> Does something already exist to do this?
> 
> 
> Michael Craige
> Emory University

______________________________
Marco Blanchette, Ph.D.

mblanche at uclink.berkeley.edu

Donald C. Rio's lab
Department of Molecular and Cell Biology
16 Barker Hall
University of California
Berkeley, CA 94720-3204

Tel: (510) 642-1084
Cell: (510) 847-0996
Fax: (510) 642-6062