[Biojava-l] Fasta File

Richard HOLLAND hollandr at gis.a-star.edu.sg
Mon May 2 22:08:08 EDT 2005


Have a look at Mark's Biojava in Anger book:

http://www.biojava.org/docs/bj_in_anger/

Particularly this bit:

http://www.biojava.org/docs/bj_in_anger/ReadFasta.htm

This second link has two examples. One reads a FASTA file and stores it
as a SequenceDB object, the other reads it as a SequenceIterator
object. You can get a SequenceIterator object from a SequenceDB by
calling db.sequenceIterator() (assuming your SequenceDB object is
called db).
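
If it helps to see what reading a FASTA file amounts to, here is a rough
sketch in plain Java that does not use BioJava at all. SimpleFasta and
readAll are names made up for this illustration, and this is not how
BioJava implements its readers; it only shows the idea of collecting each
">" header and joining the wrapped sequence lines that follow it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of BioJava.
class SimpleFasta {

    // Reads every record into an ordered map of header line -> residues.
    // A repeated header overwrites the earlier record with the same header.
    static Map<String, String> readAll(BufferedReader in) {
        Map<String, String> records = new LinkedHashMap<>();
        String header = null;
        StringBuilder residues = new StringBuilder();
        try {
            String line;
            while ((line = in.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                if (line.startsWith(">")) {
                    // A new header closes the previous record, if any.
                    if (header != null) {
                        records.put(header, residues.toString());
                    }
                    header = line.substring(1).trim();
                    residues.setLength(0);
                } else if (header != null) {
                    // Sequence lines may be wrapped, so join them.
                    residues.append(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Store the last record in the file.
        if (header != null) {
            records.put(header, residues.toString());
        }
        return records;
    }
}
```

You would call it with something like
SimpleFasta.readAll(new BufferedReader(new FileReader("proteins.fasta")))
(the file name is only an example) and then loop over the map entries.
For real work the BioJava readers are the better choice, since they also
check the residues against an alphabet.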

Once you have a SequenceIterator you use it like this (assuming the
instance of this object is called si) to iterate over the sequences in
the file:

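	while (si.hasNext()) {
		// nextSequence() returns a Sequence directly, so no cast is needed;
		// it can throw BioException, so call it where that is handled
		Sequence s = si.nextSequence();
		// Do stuff with each sequence s here
	}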

The Sequence object has methods for getting the name, sequence
(represented as a SymbolList object) etc. - look it up in the BioJava
Javadocs at http://www.biojava.org/docs/api/
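
For the comparison with the XML file, here is a minimal sketch in plain
Java. It assumes the peptide strings follow the "X.PEPTIDE.Y" layout seen
in your example (R.ILAGPAGDSNVVK.L), where the residues outside the dots
are the neighbouring residues and not part of the peptide itself.
PeptideMatch, core and find are names made up for this illustration, and
the protein string would come from something like s.seqString().

```java
// Hypothetical helper for illustration only; not part of BioJava.
class PeptideMatch {

    // Strips the flanking residues from a peptide written as "X.PEPTIDE.Y".
    // Returns the input unchanged if it does not contain two dots.
    static String core(String flanked) {
        int first = flanked.indexOf('.');
        int last = flanked.lastIndexOf('.');
        if (first >= 0 && last > first) {
            return flanked.substring(first + 1, last);
        }
        return flanked;
    }

    // Returns the 0-based offset of the peptide core within the protein,
    // or -1 if it does not occur. The comparison ignores case.
    static int find(String protein, String flanked) {
        return protein.toUpperCase().indexOf(core(flanked).toUpperCase());
    }
}
```

A result of -1 from find means the peptide does not occur in that
protein. This is only an exact substring match; it makes no allowance
for modifications or ambiguous residues.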

cheers,
Richard

Richard Holland
Bioinformatics Specialist
GIS extension 8199


> -----Original Message-----
> From: biojava-l-bounces at portal.open-bio.org 
> [mailto:biojava-l-bounces at portal.open-bio.org] On Behalf Of 
> Andrea Girardi
> Sent: Monday, May 02, 2005 11:15 PM
> To: biojava-l at biojava.org
> Subject: [Biojava-l] Fasta File
> 
> 
> Hi to all
> 
> I'm new to this mailing list. I have a problem with a FASTA file. I
> downloaded the BioJava libs today and I'd like to know if there are
> any methods to open a FASTA file like this:
> 
>  >gi|5713315|ref|NP_002060.1| (NM_002069) guanine nucleotide binding 
> protein (G protein), alpha inhibiting activity polypeptide 1 [Homo 
> sapiens]gi|12733317|ref|XP_011603.1| (XM_011603) 
> hypothetical protein 
> XP_011603 [Homo sapiens]gi|14749912|ref|XP_034149.1| (XM_034149) 
> guanine nucleotide binding protein (G protein), alpha inhibiting 
> activity polypeptide 1 [Homo sapiens]gi|121019|sp|P04898|GBI1_HUMAN 
> GUANINE NUCLEOTIDE-BINDING PROTEIN G(I), ALPHA-1 SUBUNIT (ADENYLATE 
> CYCLASE-INHIBITING G ALPHA PROTEIN)gi|71892|pir||RGBOI1 GTP-binding 
> regulatory protein Gi alpha-1 chain (adenylate cyclase-inhibiting) - 
> bovinegi|2144867|pir||RGHUI1 GTP-binding regulatory protein 
> Gi alpha-1 
> chain (adenylate cyclase-inhibiting) - humangi|391|emb|CAA27288.1| 
> (X03642) alpha-subunit [Bos taurus]gi|3005737|gb|AAC09361.1| 
> (AF055013) 
> guanine nucleotide-binding protein alpha-i subunit [Homo 
> sapiens]gi|224920|prf||1204197A protein Gi alpha [Bos taurus]
> MGCTLSAEDKAAVERSKMIDRNLREDGEKAAREVKLLLLGAGESGKSTIVKQMKIIHEAGYSEEECKQYKAVVYSNTIQS
> IIAIIRAMGRLKIDFGDSARADDARQLFVLAGAAEEGFMTAELAGVIKRLWKDSGVQACFNRSREYQLNDSAAYYLNDLD
> RIAQPNYIPTQQDVLRTRVKTTGIVETHFTFKDLHFKMFDVGGQRSERKKWIHCFEGVTAIIFCVALSDYDLVLAEDEEM
> NRMHESMKLFDSICNNKWFTDTSIILFLNKKDLFEEKIKKSPLTICYPEYAGSNTYEEAAAYIQCQFEDLNKRKDTKEIY
> THFTCATDTKNVQFVFDAVTDVIIKNNLKDCGLF
> 
> I have to compare this sequence with a string in an XML file like this:
> 
> <?xml version="1.0"?>
> <sequestresults xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
> xsi:noNamespaceSchemaLocation="schema1.xsd">
> <origfilename></origfilename>
> <origfilepath>D:\sequest\Emoclot_1\Fattore_Ottavo_06_E5\03\</origfilepath>
> <bioworksinfo>3.1 SR1</bioworksinfo>
> <protein>
>     <score>1070.4</score>
>     <accession>4507907</accession>
>     <peptide>
>         <file>Fattore_Ottavo_E5_02_001,493-495</file>
>         <sequence>R.ILAGPAGDSNVVK.L</sequence>
>         <mass>1241.419</mass>
>         <charge>2</charge>
>         <xcorr>2.947</xcorr>
>         <deltacn>0.048</deltacn>
>         <sp>865.5</sp>
>         <rsp>1</rsp>
>         <ions>16/24</ions>
>         <count>1</count>
>     </peptide>
>         <xpress></xpress>
>     </peptide>
> 
> Can someone help me?
> Thanks,
> 
> Andrea
> University of Verona, Italy
> 
> 
> 


