[Biojava-l] BLAST Parser for extracting all BLAST data?

Sébastien PETIT great_fred at yahoo.com
Tue Jun 28 05:11:12 EDT 2005


Hi everybody,

Like George, I want to extract data from BLAST files.
I can get the alignments, no problem. But now I also want the match
lines between the 2 sequences (the lines with "+", "-" and some letters
in George's example), because they let us see at a glance whether the
alignment between the 2 sequences is really good or not.

Is that possible?

Thank you.

Sebastien

--- Richard HOLLAND <hollandr at gis.a-star.edu.sg> wrote:

> BioJava's BLAST framework parses files and fires events for every
> piece of information it finds. The SeqSimilarityAdapter class is an
> example of how to catch these events and construct basic BLAST result
> objects (SimpleSeqSimilarityHit), however they are not comprehensive
> and do not record full details of every hit.
> 
> If you want the kind of detail you mention below you will have to
> write your own content handler for BLAST parsing and pass it to the
> BLASTLikeSAXParser when parsing a file. This event handler should
> implement the ContentHandler interface. Look at the source of
> SeqSimilarityAdapter for guidance. You will then receive events for
> every part of the file, from which you can construct your own custom
> BLAST result objects to describe them.
> 
> If you're not sure what tag names to listen for in your
> ContentHandler the easiest thing to do is just run it once and dump
> them all out to see what you get.
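A minimal sketch of the "run it once and dump everything" idea, assuming
nothing beyond the JDK's own SAX classes. TagDumpHandler and dump() are
hypothetical names, not BioJava classes. The demo feeds plain XML through
the JDK parser only so the example runs on its own; with BioJava one would
presumably hand the same handler to the BLASTLikeSAXParser instead (for
instance through setContentHandler, since it is a SAX-style reader) and
parse the BLAST file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: records every element name (and its attributes) it is told about. */
class TagDumpHandler extends DefaultHandler {
    private final List<String> seen = new ArrayList<>();
    private int depth = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        StringBuilder line = new StringBuilder();
        // Indent by nesting depth so the event structure is visible.
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        line.append(qName);
        for (int i = 0; i < attrs.getLength(); i++) {
            line.append(' ').append(attrs.getQName(i)).append('=').append(attrs.getValue(i));
        }
        seen.add(line.toString());
        depth++;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        depth--;
    }

    List<String> lines() {
        return seen;
    }

    /** Stand-alone demo: runs the handler over an XML string with the JDK SAX parser. */
    static List<String> dump(String xml) {
        TagDumpHandler handler = new TagDumpHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
        return handler.lines();
    }

    public static void main(String[] args) {
        // The element names here are made up; the real ones come from the parser's events.
        for (String line : dump("<hit><summary score=\"100\"/><alignment/></hit>")) {
            System.out.println(line);
        }
    }
}
```

Once the dump shows which element names carry the data of interest, the
same startElement/characters callbacks can be used to fill custom result
objects.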
> 
> cheers,
> Richard
> 
> 
> -----Original Message-----
> From:	biojava-l-bounces at portal.open-bio.org on behalf of Y D Sun
> Sent:	Sun 6/26/2005 5:42 PM
> To:	biojava-l at biojava.org
> Cc:	
> Subject:	[Biojava-l] BLAST Parser for extracting all BLAST data?
> 
> Hi,
> 
> I want to extract all data from BLASTP results. In the following hit,
> for example, I need to get the lengths of query and subject proteins,
> the identities (including all data 54, 124 and 43%), the positives
> (all data 79, 124 and 63%), and the gaps (3, 124 and 2%). Can the
> BLASTLikeSAXParser filter all this information? I can't find the
> methods in SeqSimilaritySearchHit and SeqSimilaritySearchSubHit APIs
> to retrieve these data. Does BioJava provide any methods for this
> purpose?
> 
> Thanks,
> 
> George
> 
> 
> BLASTP 2.2.5 [Nov-16-2002]
> 
> Query= Prot0001
>          (138 letters)
> 
> Database: /work/nys1/fasta/protein/AE000782.pro.fasta
>            2407 sequences; 662,866 total letters
> 
> Searching.....done
> 
>                                                                  Score    E
> Sequences producing significant alignments:                      (bits) Value
> 
> Prot0002                                                           100   1e-23
> Prot0003                                                            74   2e-15
> Prot0004                                                            43   3e-06
> 
> >Prot0002
>           Length = 138
> 
>  Score =  100 bits (250), Expect = 1e-23
>  Identities = 54/124 (43%), Positives = 79/124 (63%), Gaps = 3/124 (2%)
> 
> Query: 18  NARTKFTDIAKTLNLTEAAIRKRIKKLEENQIIKRYSIDIDYKKLGYNMAIIGLDIDMDY 77
>            NAR   T IAK LN+TEAA+RKRI  LE  + I  Y   I+YKK+G + ++ G+D+D D
> Sbjct: 15  NARIPKTRIAKELNVTEAAVRKRIANLERREEILGYKAIINYKKVGLSASLTGVDVDPDK 74
> 
> Query: 78  FPKIIKELEKRKEFLHIYSSAGDHDIMVIAIYK---DLEEIYNYLKNLKGVKRVCPAIII 134
>              K+++EL+  +    ++ + GDH IM   I K   +L EI+  +  ++GVKRVCP+II
> Sbjct: 75  LWKVVEELKDLESVKSLWLTTGDHTIMAEIIAKSVQELSEIHQKIAEMEGVKRVCPSIIT 134
> 
> Query: 135 DQIK 138
>            D +K
> Sbjct: 135 DIVK 138
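As a fallback that does not depend on BioJava at all, the Identities /
Positives / Gaps figures in a report like the one above can be pulled out
with a plain regular expression. This is only a sketch: HspStats and
parse() are hypothetical names, and it assumes the three fields appear in
the "name = n/m (p%)" form shown in this BLASTP 2.2.5 output, with any
mail-wrapped line already joined back together.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of BioJava: reads the per-hit statistics line. */
class HspStats {
    // Matches fields such as "Identities = 54/124 (43%)".
    private static final Pattern FIELD = Pattern.compile(
            "(Identities|Positives|Gaps)\\s*=\\s*(\\d+)/(\\d+)\\s*\\((\\d+)%\\)");

    /** Returns field name -> {count, aligned length, percent}, in order of appearance. */
    static Map<String, int[]> parse(String text) {
        Map<String, int[]> stats = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(text);
        while (m.find()) {
            stats.put(m.group(1), new int[] {
                Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3)),
                Integer.parseInt(m.group(4))
            });
        }
        return stats;
    }

    public static void main(String[] args) {
        String line = " Identities = 54/124 (43%), Positives = 79/124 (63%), Gaps = 3/124 (2%)";
        for (Map.Entry<String, int[]> e : parse(line).entrySet()) {
            int[] v = e.getValue();
            System.out.println(e.getKey() + ": " + v[0] + "/" + v[1] + " (" + v[2] + "%)");
        }
    }
}
```

A field that is absent from the line (BLAST omits "Gaps" for ungapped
hits) is simply missing from the returned map, so callers should check
for it rather than assume all three keys exist.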
> 
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l at biojava.org
> http://biojava.org/mailman/listinfo/biojava-l