[Bioperl-l] blast report parsing addition(s)
Wiepert, Mathieu
Wiepert.Mathieu@mayo.edu
Mon, 11 Nov 2002 09:01:08 -0600
Hi,
I changed the way accession numbers are parsed from BLAST reports. In some cases the locus was being grabbed instead of the accession number. I also added a locus method to HitI, in case anyone wants the locus.
I also added a method called each_accession_number to HitI, which returns all the accession numbers found in the hit description. I found I needed all of them to help categorize or organize my hits. It is meant for hits that look like:
>ref|NP_065733.1| (NM_020682) Cyt19 protein; likely ortholog of rat methyltransferase
Cyt19; S-adenosylmethionine:arsenic (III)
methyltransferase [Homo sapiens]
pir||T14789 hypothetical protein DKFZp586L0724.1 - human
emb|CAB53709.1| (AL110271) hypothetical protein [Homo sapiens]
gb|AAG09731.1|AF226730_1 (AF226730) Cyt19 [Homo sapiens]
gb|AAH01726.1|AAH01726 (BC001726) Similar to DKFZP586L0724 protein [Homo sapiens]
Both methods are implemented in GenericHit.
Usage looks something like:
my $locus = $hit->locus;
my @accnums = $hit->each_accession_number;
# $acc rather than $a, since $a is special to Perl's sort
foreach my $acc (@accnums) {
    print "\tHit Accnums: ", $acc, "\n";
}
The first element of the list returned by each_accession_number is always the accession number returned by $hit->accession().
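
For anyone who wants to see the general idea outside of BioPerl, here is a rough standalone sketch of pulling accession numbers out of a description like the one above. It is not the code in GenericHit. The helper name extract_accessions and the list of database tags are assumptions made up for illustration, and the sketch only looks at the pipe-delimited identifiers (it ignores the accessions in parentheses, such as (NM_020682)). For pir-style entries, where the second field is empty, it falls back to the third field.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: collect accession numbers from
# a BLAST hit description made of "db|accession|locus" style identifiers.
sub extract_accessions {
    my ($desc) = @_;
    my (@accs, %seen);

    # Assumed set of database tags; extend as needed.
    # Matches e.g. ref|NP_065733.1|   pir||T14789   gb|AAG09731.1|AF226730_1
    while ($desc =~ /\b(?:ref|pir|emb|gb|dbj|sp|prf|pdb)\|([^|\s]*)\|(\S*)/g) {
        my ($acc, $third) = ($1, $2);

        # pir entries leave the second field empty; use the third one instead
        $acc = $third if $acc eq '';

        # keep first-seen order and skip empties and duplicates
        push @accs, $acc unless $acc eq '' or $seen{$acc}++;
    }
    return @accs;
}

my $desc = '>ref|NP_065733.1| (NM_020682) Cyt19 protein [Homo sapiens] '
         . 'pir||T14789 hypothetical protein DKFZp586L0724.1 - human '
         . 'emb|CAB53709.1| (AL110271) hypothetical protein [Homo sapiens] '
         . 'gb|AAG09731.1|AF226730_1 (AF226730) Cyt19 [Homo sapiens]';

foreach my $acc (extract_accessions($desc)) {
    print "\tHit Accnums: ", $acc, "\n";
}
```

With the description above this should list NP_065733.1, T14789, CAB53709.1 and AAG09731.1, in that order. Whether the real method also keeps the version suffix or the parenthesized accessions is not something this sketch tries to mirror.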
-Mat