[Bioperl-l] what are the key features of genbank file recognized by SeqIO?

Jason Stajich jason at bioperl.org
Tue Oct 16 23:58:46 UTC 2007


I am sure we should also let the Artemis developers know what is  
going on so they can help fix the bug on their end as well.

Have you also tried dumping EMBL from Artemis, a format I assume they  
are more likely to use, and seeing if that parses fine in BioPerl?
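
A minimal sketch of that check, assuming BioPerl is installed. The file names contig.gbk and contig.embl are hypothetical placeholders for whatever Artemis exports, and count_records is just an illustrative helper, not part of BioPerl:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;

# Parse a file in the given format and report what BioPerl sees.
# Returns the number of sequence records read.
sub count_records {
    my ($file, $format) = @_;
    my $in = Bio::SeqIO->new(-file => $file, -format => $format);
    my $n = 0;
    while (my $seq = $in->next_seq) {
        my @features = $seq->get_SeqFeatures;
        printf "%s: id=%s length=%d features=%d\n",
            $format, $seq->display_id, $seq->length, scalar @features;
        $n++;
    }
    return $n;
}

# Hypothetical file names; substitute the files exported from Artemis.
unless (caller) {
    my %files = (genbank => 'contig.gbk', embl => 'contig.embl');
    for my $format (sort keys %files) {
        # eval traps the exception BioPerl throws on an unreadable file
        my $n = eval { count_records($files{$format}, $format) };
        print "$format: ", (defined $n ? "$n record(s)" : "failed: $@"), "\n";
    }
}
```

If the EMBL export reports the expected length and feature count while the GenBank export does not, that narrows the problem to the GenBank output.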

-jason

On Oct 16, 2007, at 4:47 PM, Barry Moore wrote:

> Hi Sheri,
>
> You would need to look at the source code for Bio::SeqIO::genbank
> (genbank.pm), specifically the next_seq method.  If you let us know
> what kind of errors you're getting, or what data is missing in your
> seq object, perhaps we could help you narrow it down.  You could also
> send one of your GenBank-style files that isn't working and someone
> might be willing to take a stab at running it.
>
> Barry
>
> On Oct 16, 2007, at 5:15 PM, Sheri Simmons wrote:
>
>> I'm hoping someone can help me figure out how to generate GenBank
>> files from scratch that can be interpreted by SeqIO. The problem is
>> that GenBank files generated by the Artemis program are not
>> recognized by SeqIO, so I am attempting to generate SeqIO-compatible
>> GenBank files so they can be converted to other formats later.
>> I produced a file which looks by eye exactly like standard GenBank
>> files, but which is not recognized by SeqIO. Could anyone tell me or
>> refer me to a source that explains the exact format that
>> SeqIO::genbank requires?
>> FYI here's a snippet of the file generated by my program:
>>
>> ########
>> LOCUS   Contig2406              66712 bp        dna     linear  UNK
>> FEATURES             Location/Qualifiers
>>      CDS                 83..1222
>>                             /gene="org_2406_0001"
>>      CDS                1259..1576
>>                            /gene="org_2406_0002"
>>      CDS                complement(1830..3284)
>>                            /gene="org_2406_0003c"
>> (more CDS)
>> BASE COUNT      14091 a         19796 c         18841 g         13747 t
>> ORIGIN
>>         1 gtcgactctg aggatcccct ccttctgaat agaccaacca tttgaagcta acatacacaa
>>       61 taagaattct attgcacttg agatgcttcg tctagtagat gcattgcctg ccgatatcaa
>> (more sequence)
>> ###########
>>
>> Thanks,
>> Sheri
>>
>> --
>> Sheri Simmons
>> Department of Earth and Planetary Sciences
>> University of California, Berkeley
>> Berkeley, CA 94720-4767
>> http://webfiles.berkeley.edu/~sheris
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org



