[Bioperl-l] One more odd little parsing problem for your list
Hilmar Lapp
hlapp at gnf.org
Sun Jun 1 01:32:47 EDT 2003
Ouch, a space in the locus identifier. Why do they do this to us?
I'm afraid I have to kludge this. We can't afford the GenBank parser
dying on GenBank's random pranks. Isn't this fun?
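
For illustration only, here is a sketch of what such a workaround could
look like. It is written in Python rather than Perl and is not the code
in Bio::SeqIO::genbank; the function name parse_locus and the sample
LOCUS line are hypothetical, reconstructed from the values reported
below. The idea is to stop treating a fixed whitespace-separated field
as the length, and instead anchor on the "<digits> bp" token, so that
everything between "LOCUS" and that token counts as the locus name.

```python
import re

# Anchor on "<digits> bp" (or "aa" for protein records) instead of
# assuming a fixed whitespace-separated field holds the length.
# The lazy name group may then contain embedded spaces.
LOCUS_RE = re.compile(
    r"^LOCUS\s+(?P<name>.+?)\s+(?P<length>\d+)\s+(?:bp|aa)\b"
)

def parse_locus(line):
    """Return (locus_name, length) from a GenBank-style LOCUS line.

    Hypothetical helper for illustration, not a BioPerl API.
    """
    m = LOCUS_RE.match(line)
    if m is None:
        raise ValueError("unrecognized LOCUS line: %r" % line)
    return m.group("name"), int(m.group("length"))

# Illustrative line only; the real record's exact layout may differ.
sample = "LOCUS       PSMAL/GCP III  1992 bp    mRNA"

# Naive whitespace splitting puts 'III' where the length is expected.
naive_fields = sample.split()

# Anchoring on the unit keeps the name whole and finds the length.
name, length = parse_locus(sample)
```

With the sample line, naive_fields[2] is 'III', while parse_locus
gives the name 'PSMAL/GCP III' and the length 1992. A line whose name
has no space is handled the same way.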
-hilmar
On Saturday, May 31, 2003, at 12:00 PM, Michael Muratet wrote:
> Greetings
>
> I was parsing CDS features in RefSeq human (hs.gbff.gz) when it died on
> PSMAL/GCP III (NM_153696). The CDS was 527..1855. The error was from
> Bio::PrimarySeq::subseq: 'You have to start positive....sequence
> [527:1855] Total III'. The parser in SeqIO is picking up the length from
> the LOCUS line (as I recall), and for this record it sees 'III' and not
> '1992' bp. It seems a lot to ask Bioperl to figure out every possible
> configuration; maybe GenBank needs to have rules about whitespace in
> gene names?
>
> Cheers
>
> Mike
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------