[Bioperl-l] SeqIO alters Genbank files

Brian Osborne bosborne11 at verizon.net
Thu Aug 25 14:35:29 UTC 2011


bioperl-l,

I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file:

/score=100.1

And adding a "note" tag, so the output file contains this:

/score=100.1
/note="score=100.1"

I assume the code does this because NCBI will not accept a /score tag and its value, even though Bioperl, generally speaking, does not treat NCBI as the authority on the fine details of Genbank format.

On the other hand, I don't like the idea that SeqIO is altering the content. It also turns out that if your code does multiple round-trips (read a file, write it, read the result, write it again), each pass adds another copy of the note, and you end up with text like this:

/score=100.1
/note="score=100.1"
/note="score=100.1"
/note="score=100.1"
/note="score=100.1"
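
To make the accumulation concrete, here is a minimal sketch in Python. It is not the code in FTHelper.pm; it only models the behavior described above, with a feature's qualifiers represented as a list of (tag, value) pairs. The function names and the data model are hypothetical, chosen for illustration. The second function shows one possible alternative to removing the behavior outright: add the note only when an identical one is not already present.

```python
# Hypothetical model of the behavior described above, NOT FTHelper.pm itself.
# A feature's qualifiers are modeled as a list of (tag, value) pairs.

def add_score_note(qualifiers):
    """Unconditionally append a note for every score qualifier."""
    out = list(qualifiers)
    for tag, value in qualifiers:
        if tag == "score":
            out.append(("note", f"score={value}"))
    return out


def add_score_note_once(qualifiers):
    """Append the note only if an identical note is not already present."""
    out = list(qualifiers)
    for tag, value in qualifiers:
        note = ("note", f"score={value}")
        if tag == "score" and note not in out:
            out.append(note)
    return out


def round_trip(qualifiers, writer, passes):
    """Simulate reading and re-writing the same feature `passes` times."""
    for _ in range(passes):
        qualifiers = writer(qualifiers)
    return qualifiers


start = [("score", "100.1")]
unguarded = round_trip(start, add_score_note, 4)     # one more note per pass
guarded = round_trip(start, add_score_note_once, 4)  # stays at a single note

for tag, value in unguarded:
    print(f"/{tag}={value}" if tag == "score" else f'/{tag}="{value}"')
```

With four passes the unguarded version yields the score line followed by four identical notes, as in the output above, while the guarded version stops at one note. The guard makes the write idempotent, but it still alters the content on the first pass, so it does not settle whether SeqIO should add the note at all.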

Should I comment out the code that's doing these edits or not?

Thanks again,

Brian O.





