[Bioperl-l] blast report parsing addition(s)

Jason Stajich jason@cgt.mc.duke.edu
Mon, 11 Nov 2002 10:10:31 -0500 (EST)


Nice work, Mat! I'll give it a walkthrough this week.

-jason

On Mon, 11 Nov 2002, Wiepert, Mathieu wrote:

> Hi,
>
> I changed the way accession numbers are parsed from BLAST reports.  In some cases the parser was grabbing the locus instead of the accession number.  I also added a locus method to HitI, in case anyone wants the locus.
>
> I also added a method called each_accession_number to HitI, which returns all the accession numbers found in the hit description.  I needed all of them to help categorize and organize my hits.  It parses hits that look like:
>
> >ref|NP_065733.1| (NM_020682) Cyt19 protein; likely ortholog of rat methyltransferase
>            Cyt19; S-adenosylmethionine:arsenic (III)
>            methyltransferase [Homo sapiens]
>  pir||T14789 hypothetical protein DKFZp586L0724.1 - human
>  emb|CAB53709.1| (AL110271) hypothetical protein [Homo sapiens]
>  gb|AAG09731.1|AF226730_1 (AF226730) Cyt19 [Homo sapiens]
>  gb|AAH01726.1|AAH01726 (BC001726) Similar to DKFZP586L0724 protein [Homo sapiens]
>
> Both methods are implemented in GenericHit.
>
>
> Usage is something like:
> my $locus = $hit->locus;
> my @accnums = $hit->each_accession_number;
> # $a is reserved for sort blocks, so use a different loop variable
> foreach my $acc (@accnums) {
>   print "\tHit Accnums: ", $acc, "\n";
> }
>
> For each_accession_number, the first one in the list is always the accession number returned by $hit->accession().
>
> -Mat
>
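
For anyone following along, here is a rough, standalone sketch of the kind of parsing described above: pulling every "db|accession|locus" identifier out of a concatenated hit description. This is not the code in GenericHit. The sub name accessions_from_description, the list of database tags, and the rule of taking the first non-empty field (which for the pir|| entry yields T14789) are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT the BioPerl implementation): collect one
# identifier per "db|field1|field2" token found in a hit description.
sub accessions_from_description {
    my ($desc) = @_;
    my @accs;
    # Assumed set of database tags; extend as needed.
    while ($desc =~ /\b(?:ref|pir|emb|gb|dbj|sp|prf|pdb)\|([^|\s]*)\|([^|\s]*)/g) {
        my ($first, $second) = ($1, $2);
        # Assumption: prefer the first field, fall back to the second
        # when the first is empty (as in "pir||T14789").
        my $acc = length($first) ? $first : $second;
        push @accs, $acc if length($acc);
    }
    return @accs;
}

# The example hit from the message, joined onto one line.
my $description =
      'ref|NP_065733.1| (NM_020682) Cyt19 protein; likely ortholog of rat '
    . 'methyltransferase Cyt19; S-adenosylmethionine:arsenic (III) '
    . 'methyltransferase [Homo sapiens] '
    . 'pir||T14789 hypothetical protein DKFZp586L0724.1 - human '
    . 'emb|CAB53709.1| (AL110271) hypothetical protein [Homo sapiens] '
    . 'gb|AAG09731.1|AF226730_1 (AF226730) Cyt19 [Homo sapiens] '
    . 'gb|AAH01726.1|AAH01726 (BC001726) Similar to DKFZP586L0724 protein [Homo sapiens]';

my @accnums = accessions_from_description($description);
foreach my $acc (@accnums) {
    print "\tHit Accnums: ", $acc, "\n";
}
```

The sketch keeps version suffixes (e.g. ".1") and ignores the parenthesised nucleotide accessions such as (NM_020682); whether each_accession_number does the same is something to check against the actual code.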

-- 
Jason Stajich
Duke University
jason at cgt.mc.duke.edu