[Bioperl-l] blast report parsing addition(s)
Jason Stajich
jason@cgt.mc.duke.edu
Mon, 11 Nov 2002 10:10:31 -0500 (EST)
Nice work Mat! - Will give it a walkthrough this week.
-jason
On Mon, 11 Nov 2002, Wiepert, Mathieu wrote:
> Hi,
>
> I changed the way accession numbers are parsed from BLAST reports. In some cases the locus was being grabbed instead of the accession number. I also added a locus method to HitI, in case anyone wants the locus.
>
> I also added a method called each_accession_number to HitI, which returns all the accession numbers found in the description. I found that I needed all of them to help categorize and organize my hits. It is meant for hits that look like:
>
> >ref|NP_065733.1| (NM_020682) Cyt19 protein; likely ortholog of rat methyltransferase
> Cyt19; S-adenosylmethionine:arsenic (III)
> methyltransferase [Homo sapiens]
> pir||T14789 hypothetical protein DKFZp586L0724.1 - human
> emb|CAB53709.1| (AL110271) hypothetical protein [Homo sapiens]
> gb|AAG09731.1|AF226730_1 (AF226730) Cyt19 [Homo sapiens]
> gb|AAH01726.1|AAH01726 (BC001726) Similar to DKFZP586L0724 protein [Homo sapiens]
>
> Both methods are implemented in GenericHit.
>
>
> Usage is something like:
> my $locus = $hit->locus;
> my @accnums = $hit->each_accession_number;
> foreach my $acc (@accnums) {  # avoid $a, which sort() treats specially
>     print "\tHit Accnums: ", $acc, "\n";
> }
>
> For each_accession_number, the first one in the list is always the accession number returned by $hit->accession().
>
> -Mat
>
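For illustration only, here is a minimal standalone sketch of the kind of description parsing discussed above. It is not the GenericHit code: the helper name extract_accessions, the list of database tags, and the choice to keep version suffixes are all assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of BioPerl: collect the identifier that
# follows each NCBI-style database tag (ref|, gb|, emb|, ...) in a
# BLAST hit description.
sub extract_accessions {
    my ($desc) = @_;
    my @accs;
    # Entries look like db|accession|locus. For pir the accession field
    # is empty and the identifier sits in the next field (pir||T14789).
    while ($desc =~ /\b(?:ref|gb|emb|dbj|pir|sp|prf|pdb)\|([^|\s]*)\|([^|\s]*)/g) {
        my $id = length($1) ? $1 : $2;
        push @accs, $id if length($id);
    }
    return @accs;
}

# Description text taken from the example hit quoted above.
my $desc = join ' ',
    'ref|NP_065733.1| (NM_020682) Cyt19 protein',
    'pir||T14789 hypothetical protein DKFZp586L0724.1 - human',
    'emb|CAB53709.1| (AL110271) hypothetical protein [Homo sapiens]',
    'gb|AAG09731.1|AF226730_1 (AF226730) Cyt19 [Homo sapiens]';

print "\tHit Accnums: $_\n" for extract_accessions($desc);
```

The sketch keeps the identifiers in the order they appear, so the first element corresponds to the leading ref| entry. Whether the real methods keep or strip the version suffix is not stated in this thread.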
--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu