[Bioperl-l] Problems parsing swiss-prot files
Jessica Dantzer
jdantzer at cs.iupui.edu
Tue Jul 6 15:00:14 EDT 2004
We added both of the files to our current version of Bioperl, and things
seem to be working as they should. Thanks for the help!
Jessica
> I've fixed it in CVS. I also fixed a bunch of other things in swissprot
> parsing to make the parser cleaner, I hope. This involved improving the
> 'new' function in Bio::Annotation::Reference, so you'd want to get that
> as well if you are getting code from CVS.
>
> Multi-line RP lines are now all put into the rp field of the
> Annotation::Reference object. The parser takes care of splitting it
> back into multi-line fields upon writing (although I didn't test this
> case specifically).
>
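To illustrate the behaviour described above (several RP lines collected into one rp value, then split back into lines on output), here is a minimal sketch. It is not the Bioperl code, and it is written in Python rather than Perl. The function names `collect_rp` and `write_rp`, the sample record lines, and the 75-column width are all assumptions made for illustration.

```python
import textwrap

# Made-up reference block for illustration only; not taken from a real entry.
SAMPLE = [
    "RN   [1]",
    "RP   FIRST PART OF THE REFERENCE POSITION TEXT, AND",
    "RP   SECOND PART ON A CONTINUATION LINE.",
    "RA   Author A.;",
    "RN   [2]",
    "RP   SINGLE LINE ONLY.",
]


def collect_rp(lines):
    """Return one joined RP string per reference (each RN line starts a reference)."""
    refs = []
    for line in lines:
        # Two-character line code, three spaces, then the data.
        code, text = line[:2], line[5:].strip()
        if code == "RN":
            refs.append([])
        elif code == "RP" and refs:
            refs[-1].append(text)
    return [" ".join(parts) for parts in refs]


def write_rp(rp, width=75):
    """Split one RP string back into 'RP   ' lines no longer than `width` columns."""
    return ["RP   " + chunk for chunk in textwrap.wrap(rp, width=width - 5)]


rp_values = collect_rp(SAMPLE)
rewritten = write_rp(rp_values[0])
```

Joining with a single space and re-wrapping on whitespace means the text survives a round trip, but the line breaks may land in different places than in the input file.
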
> PVH and our code auditors: as happy as I am about the code audit for
> SeqIO and the like, and about making sure that things can round-trip, I
> really feel like the guts of these parsers could just use a few weeks of
> someone's time to clean them up first. Of course, a few others and I
> would want to simplify the sequence/annotation/feature object model
> first, so who knows what the best starting point is...
>
> -jason
>
> On Fri, 2 Jul 2004, Jessica Dantzer wrote:
>
>> Most of the references in most of the files have only one RP
>> line. Occasionally, there are two. I haven't seen more than two,
>> though. One of the files that had more than one line in at least one
>> reference was for P33897. I'm parsing information on the
>> mutation/variant data and their references, and so need some of the
>> information on those second lines.
>>
>> At 03:55 PM 7/2/2004, Jason Stajich wrote:
>> >Is there more than one RP line per reference? The data structures
>> >and parsers currently assume there is only one.
>> >can you send an acc so we can add it to the tests?
>> >
>> >-jason
>> >On Thu, 1 Jul 2004, Jessica Dantzer wrote:
>> >
>> > > I'm working on parsing swiss-prot files for use in another
>> > > database, and I've managed to work out where all the information I
>> > > need is stored for the most part. The only problems I'm
>> > > encountering are with the reference parsing-- Some of the files
>> > > have multiple "RP" lines, and I only seem to be able to get one.
>> > > The code seems to indicate that this is how the files are parsed.
>> > > Is there any other way to access the second line?
>> > >
>> > > Thanks,
>> > > Jessica
>> > >
>> > > _______________________________________________
>> > > Bioperl-l mailing list
>> > > Bioperl-l at portal.open-bio.org
>> > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
>> > >
>> >
>> >--
>> >Jason Stajich
>> >Duke University
>> >jason at cgt.mc.duke.edu
>>
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu