[Biojava-l] make feature to create embl or genbank file

Hedwig Kurka kurka at mikro.biologie.tu-muenchen.de
Wed Jun 29 14:39:06 UTC 2011


Hello all,

I have a problem creating EMBL or GenBank files.
Below are a fragment of my code and an example of what the resulting
EMBL file looks like.

       String name = "test genome";
       String seqString = pFasta.getSequence(1, pFasta.getLength());
       Sequence seq = DNATools.createDNASequence(seqString, name);
       Alphabet dna =  AlphabetManager.alphabetForName("DNA");
       RichSequence rs = RichSequence.Tools.createRichSequence(
               RichObjectFactory.getDefaultNamespace(), name, seqString, dna);
       Set<Feature> rfeatSet = new HashSet<Feature>();
       StrandedFeature.Template t = new StrandedFeature.Template();
       for(int i=0; i<annotierten.size(); i++){
                   int start = (int) Math.abs(anno.get(i).getStart());
                   int stop = (int) Math.abs(anno.get(i).getStop());
                   t.type = "CDS";                  
                   if(start < stop){
                       t.location = new RangeLocation(start, stop);
                       t.strand = StrandedFeature.POSITIVE;
                   }
                   if(start > stop){
                       t.location = new RangeLocation(stop, start);
                       t.strand = StrandedFeature.NEGATIVE;
                   }
                   Feature f = seq.createFeature(t);
                   RichFeature rf = RichFeature.Tools.enrich(f);
                   rfeatSet.add(rf);
       }
       rs.setFeatureSet(rfeatSet);
       rs = RichSequence.Tools.enrich(rs);
       RichSequence.IOTools.writeEMBL(output, rs,
               RichObjectFactory.getDefaultNamespace());

EMBL file:
FT   any             1889536..1890903
FT   any             134636..136987
FT   any             3727110..3727625
FT   any             2812636..2813517
FT   any             580648..581643
FT   any             2330962..2331921
FT   any             1012371..1013513
FT   any             1260854..1261720
FT   any             1602858..1603706
FT   any             4108079..4108999
FT   any             346637..347731
FT   any             4073395..4074549

I wonder where the plus/minus strand information has gone, and why the
file shows "any" as the feature key instead of "CDS".
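
To make clear what I expected: the feature key should read "CDS", and a
minus-strand feature should be written with the complement(...) operator
of the EMBL feature table. Below is a small plain-Java sketch of the
lines I was hoping for. It does not use BioJava at all; the class
ExpectedEmblLine and the method featureLine are made up for
illustration, and it assumes, as in my loop above, that start > stop
means the minus strand.

```java
import java.util.Locale;

public class ExpectedEmblLine {

    // Formats one EMBL feature-table line: "FT", then the feature key
    // left-aligned from column 6, then the location from column 22.
    // Assumption: start > stop marks a minus-strand feature.
    static String featureLine(String key, int start, int stop) {
        int min = Math.min(start, stop);
        int max = Math.max(start, stop);
        String location = min + ".." + max;
        if (start > stop) {
            // Minus strand is expressed with the complement operator.
            location = "complement(" + location + ")";
        }
        return String.format(Locale.ROOT, "FT   %-16s%s", key, location);
    }

    public static void main(String[] args) {
        // Plus strand: coordinates already in ascending order.
        System.out.println(featureLine("CDS", 1889536, 1890903));
        // Minus strand: coordinates given in descending order.
        System.out.println(featureLine("CDS", 136987, 134636));
    }
}
```

The second call should give a line whose location reads
complement(134636..136987), which is the strand information I am
missing in the real output.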

The only tutorial I have found is
http://www.biojava.org/wiki/BioJava:Cookbook:Locations:Feature. Is there
another one?

Thank you, any help is appreciated!

Hedwig



