[Biojava-l] blast parsing question

Andreas Prlic andreas at sdsc.edu
Thu Aug 6 20:36:02 UTC 2009


Hi Bernd,

I am not sure whether you got a reply to your mail off-list. Did you manage
to solve your problem in the meantime?

Andreas

On Wed, Jul 29, 2009 at 6:16 AM, Bernd Jagla <bernd.jagla at pasteur.fr> wrote:

> Hi,
>
> I am new to BioJava. I want to test what is going on here in order to
> potentially integrate it with KNIME.
>
> My first project is parsing BLAST output for large files. The example in
> the cookbook is very good, and I had no problems integrating everything in
> Eclipse and getting it to work.
>
> Now here is my problem:
>
> I am interested in parsing the summary table at the beginning of the
> BLAST output, and I haven't found a way to get at this information.
>
> I am blasting short sequences (20-300 nt) against genomic databases
> (mouse/human/RefSeq/miRBase). I want to know whether a given sequence (out
> of a set of sequences) aligns to a specific genome with high identity. I
> then want to split the input FASTA file into a set that aligns to the
> genome and a set that doesn't (and potentially a third list of dubious
> sequences where there is no clear answer). For this I only need the length
> of the query sequence, the score, and the first few characters of the
> header line.
>
> At least that's the way I am currently doing it. I have set the BLAST
> parameters to give me only the first alignment, but the first 50 or so
> entries in the summary.
>
> Any help or comments are appreciated.
>
> Thanks,
>
> Bernd
>
> Bernd Jagla
> Bioinformatician
>
> Institut Pasteur
> Plate-forme puces a ADN
> Genopole / Institut Pasteur
> 28 rue du Docteur Roux
> 75724 Paris Cedex 15
> France
>
> bernd.jagla at pasteur.fr
> tel: +33 (0) 140 61 35 13
>
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
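
For the summary-table part of the question, here is a rough sketch that
reads the table directly with regular expressions instead of going through
BioJava. It assumes the plain-text report of legacy NCBI BLAST, where each
query starts with "Query=", its length appears as "(N letters)", and the
table follows the line "Sequences producing significant alignments". The
class and method names (BlastSummarySketch, parse, classify) and the
bits-per-base cutoffs are made up for illustration; they are not part of
BioJava, and the cutoffs would need tuning for real data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: reads the per-query summary table from plain-text BLAST output. */
class BlastSummarySketch {

    /** One row of the summary table. */
    static final class Hit {
        final String description;
        final double bitScore;
        final double eValue;
        Hit(String description, double bitScore, double eValue) {
            this.description = description;
            this.bitScore = bitScore;
            this.eValue = eValue;
        }
    }

    /** One query with its length and its summary rows. */
    static final class Query {
        final String header;
        int length = -1; // stays -1 if no length line was found
        final List<Hit> hits = new ArrayList<>();
        Query(String header) { this.header = header; }
    }

    enum Verdict { ALIGNED, DUBIOUS, UNALIGNED }

    // Assumed layouts: "(22 letters)" (legacy) or "Length=22".
    private static final Pattern LENGTH = Pattern.compile(
        "^\\s*(?:\\(([\\d,]+) letters\\)|Length\\s*=\\s*(\\d+))\\s*$");
    // Assumed row layout: description, bit score, E-value.
    private static final Pattern ROW = Pattern.compile(
        "^(\\S.*?)\\s+(\\d+(?:\\.\\d+)?(?:e[+-]?\\d+)?)\\s+(\\d\\S*|e-\\d+)\\s*$");

    static List<Query> parse(Iterable<String> lines) {
        List<Query> queries = new ArrayList<>();
        Query current = null;
        boolean headerPhase = false; // between "Query=" and the table
        boolean inTable = false;
        for (String line : lines) {
            if (line.startsWith("Query=")) {
                current = new Query(line.substring(6).trim());
                queries.add(current);
                headerPhase = true;
                inTable = false;
            } else if (current == null) {
                continue; // report preamble before the first query
            } else if (line.startsWith("Sequences producing significant alignments")) {
                headerPhase = false;
                inTable = true;
            } else if (inTable) {
                if (line.startsWith(">")) { inTable = false; continue; }
                if (line.trim().isEmpty()) {
                    // A blank line after at least one row ends the table.
                    if (!current.hits.isEmpty()) inTable = false;
                    continue;
                }
                Matcher m = ROW.matcher(line);
                if (m.matches()) {
                    current.hits.add(new Hit(m.group(1),
                        Double.parseDouble(m.group(2)), parseEValue(m.group(3))));
                }
            } else if (headerPhase && current.length < 0) {
                Matcher m = LENGTH.matcher(line);
                if (m.matches()) {
                    String digits = m.group(1) != null ? m.group(1) : m.group(2);
                    current.length = Integer.parseInt(digits.replace(",", ""));
                }
            }
        }
        return queries;
    }

    /** Legacy BLAST prints very small E-values as "e-100"; read that as 1e-100. */
    static double parseEValue(String s) {
        return Double.parseDouble(s.startsWith("e") ? "1" + s : s);
    }

    /** Hypothetical rule: top bit score per query base against two cutoffs. */
    static Verdict classify(Query q, double alignedCut, double dubiousCut) {
        if (q.hits.isEmpty() || q.length <= 0) return Verdict.UNALIGNED;
        double bitsPerBase = q.hits.get(0).bitScore / q.length;
        if (bitsPerBase >= alignedCut) return Verdict.ALIGNED;
        return bitsPerBase >= dubiousCut ? Verdict.DUBIOUS : Verdict.UNALIGNED;
    }
}
```

Since parse only needs an Iterable of lines, a large report can be fed in
line by line rather than loaded whole. The Query.header values can then be
matched against the FASTA headers to write the three output sets. Note that
the summary table carries scores and E-values only, so "high identity" is
approximated here by score relative to query length.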


