[Biojava-l] Filtering mRNA from GenBank format files

Schreiber, Mark mark.schreiber at agresearch.co.nz
Mon Mar 10 16:52:37 EST 2003


Hi -

If you want to filter while reading the sequences in, you could write
something that extends SeqIOFilter, implements ParseErrorSource, and
passes on only the events from GenBank sequences of interest. This is a bit
more complicated and requires a more detailed understanding of the SeqIO
API, but it is ultimately more efficient than building every Sequence and
checking it afterwards if you are doing this on a large scale.
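
To make the idea concrete, here is a minimal sketch of the filtering
pattern only. It does not use the real SeqIO classes: EntryListener,
MoleculeTypeFilter and FilterSketch are hypothetical stand-ins invented for
illustration, and a real SeqIOFilter subclass would have to handle the
actual SeqIOListener callbacks instead. The filter buffers the events for
one entry and replays them to the downstream listener only when the "TYPE"
property matches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parser event listener. This is NOT the
// BioJava SeqIOListener interface, only a simplified model of it.
interface EntryListener {
    void startEntry();
    void property(String key, String value);
    void endEntry();
}

// Buffers the events of one entry and replays them to the delegate only
// when the entry's "TYPE" property equals the wanted molecule type.
class MoleculeTypeFilter implements EntryListener {
    private final EntryListener delegate;
    private final String wantedType;
    private final List<String[]> buffered = new ArrayList<>();
    private boolean wanted;

    MoleculeTypeFilter(EntryListener delegate, String wantedType) {
        this.delegate = delegate;
        this.wantedType = wantedType;
    }

    @Override
    public void startEntry() {
        buffered.clear();
        wanted = false;
    }

    @Override
    public void property(String key, String value) {
        if ("TYPE".equals(key) && wantedType.equals(value)) {
            wanted = true;
        }
        buffered.add(new String[] {key, value});
    }

    @Override
    public void endEntry() {
        if (!wanted) {
            return; // drop the whole entry; the delegate never sees it
        }
        delegate.startEntry();
        for (String[] kv : buffered) {
            delegate.property(kv[0], kv[1]);
        }
        delegate.endEntry();
    }
}

// Small driver: feeds one DNA entry and one mRNA entry through the filter.
class FilterSketch {
    public static void main(String[] args) {
        EntryListener printer = new EntryListener() {
            public void startEntry() { System.out.println("start"); }
            public void property(String k, String v) { System.out.println(k + "=" + v); }
            public void endEntry() { System.out.println("end"); }
        };
        MoleculeTypeFilter filter = new MoleculeTypeFilter(printer, "mRNA");
        filter.startEntry(); filter.property("TYPE", "DNA"); filter.endEntry();
        filter.startEntry(); filter.property("TYPE", "mRNA"); filter.endEntry();
    }
}
```

Buffering keeps the sketch independent of the order in which events
arrive. If the molecule type is known to arrive before everything else,
the filter could skip the buffer and simply ignore later events for
unwanted entries.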

- Mark


> -----Original Message-----
> From: Matthew Pocock [mailto:matthew_pocock at yahoo.co.uk] 
> Sent: Saturday, 8 March 2003 1:12 a.m.
> To: Takeshi Sasayama
> Cc: biojava-l at biojava.org
> Subject: Re: [Biojava-l] Filtering mRNA from GenBank format files
> 
> 
> Hi Takeshi,
> 
> The GenBank (and, I presume, EMBL) parser sets a property "TYPE" in the
> sequence's annotation, so something like this should work:
> 
> (untested code)
> 
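> import java.io.BufferedReader;
> import java.io.File;
> import java.io.FileReader;
> import org.biojava.bio.Annotation;
> import org.biojava.bio.seq.Sequence;
> import org.biojava.bio.seq.SequenceIterator;
> import org.biojava.bio.seq.io.SeqIOTools;
> 
> // The class name is only a placeholder for this example.
> public class FilterByType {
>   public static void main(String[] args)
>   throws Exception {
>     // each command-line argument is the path of one GenBank file
>     for(int i = 0; i < args.length; i++) {
>       SequenceIterator si = SeqIOTools.readGenbank(
>         new BufferedReader(
>           new FileReader(
>             new File(args[i]) )));
>       while(si.hasNext()) {
>         Sequence seq = si.nextSequence();
>         Annotation ann = seq.getAnnotation();
>         // the parser stores the LOCUS molecule type under "TYPE"
>         if(ann.containsProperty("TYPE")) {
>           Object type = ann.getProperty("TYPE");
>           if("DNA".equals(type)) {
>             // do something with DNA entries
>           } else if("mRNA".equals(type)) {
>             // do something with mRNA entries
>           } // ... and so on for other molecule types
>         }
>       }
>     }
>   }
> }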
> 
> Takeshi Sasayama wrote:
> > Hi,
> > 
> > I'm a newbie in Biojava and I have a question.
> > I would like to make a filter that reads from multiple GenBank format
> > files (gbpri1.seq, gbpri2.seq, ..., gbpri24.seq) and writes to a file
> > only the entries that have a molecule type of "mRNA". (By molecule type
> > I mean "DNA", "mRNA", etc., which is written just after the sequence
> > length in the LOCUS line.)
> > 
> > Could anyone show me an overview of this flow? I saw the "Biojava in
> > Anger" web page, but I don't know how to do that using Biojava.
> > Any suggestions or information are also welcome.
> > Thanks
> > 
> > Takeshi Sasayama
> > 
> > _______________________________________________
> > Biojava-l mailing list  -  Biojava-l at biojava.org
> > http://biojava.org/mailman/listinfo/biojava-l
> > 
> 
> 
> -- 
> BioJava Consulting LTD - Support and training for BioJava
> http://www.biojava.co.uk


