[Bioperl-l] BIO::DB::FASTA ID
Jason Stajich
jason at bioperl.org
Thu Jun 21 21:19:14 UTC 2007
Hey Nick -
I think
a) your IDs are not unique
b) you need to declare the function make_my_id BEFORE your call
Bio::DB::Fasta->new if you want your function to be used.
$ grep "^>" T_orthologs_Dpse_genes.fa | awk '{print $1}' | sort |
uniq | wc -l
1527
-jason
On Jun 21, 2007, at 11:36 AM, Staffa, Nick (NIH/NIEHS) wrote:
> #!/usr/bin/perl
> #
> #
> #
> use strict;
> use Bio::DB::Fasta;
> use Bio::Tools::SeqWords;
> use Bio::Seq;
> use Bio::SeqIO;
> $|=1;
> #
> #
> my $Dpse_UTR_file_for_T_orthologs =
> "/home/staffa/clients/Kari/D_pse_genome/testit/
> T_orthologs_Dpse_genes.fa";
> my $db = Bio::DB::Fasta->new
> ('/home/staffa/clients/Kari/D_pse_genome/testit/
> T_orthologs_Dpse_genes.fa',
> -reindex, -makeid => \&make_my_id);
> my @ids = $db->ids;
> my $number_in = @ids;
> print "number of Dpse IDs = $number_in\n";
> foreach my $id (@ids){
> print "$id\n";
> }
> sub make_my_id {
> # parse header line:
> # >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0
> TTATTTATT
> my $line = shift;
> # print "line = $line\n";
> $line =~ />(\w+) /;
> my $ID = $1;
> # print "ID = $ID\n";
> return $ID;
> }
--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/
More information about the Bioperl-l
mailing list