[Bioperl-l] LocusLink IO
Paul Boutros
pcboutro@engmail.uwaterloo.ca
Mon, 2 Dec 2002 13:50:41 -0500 (EST)
I followed the suggestion (I think from Allan Day) of extracting &
diff'ing record 27 from the file. This is what I got:
===========================
pcboutro@engmail[5] diff testLL.txt LL-sample.seq | more
43a44
> UNIGENE: Hs.75741
46d46
< UNIGENE: Hs.75741
52a53,54
> BUTTON: homol.gif
> LINK: http://www.ncbi.nlm.nih.gov/HomoloGene/homolquery.cgi?TEXT=26[loc]&TAXID=9606
133a136
> UNIGENE: Hs.121521
136d138
< UNIGENE: Hs.121521
144a147,148
> BUTTON: homol.gif
> LINK: http://www.ncbi.nlm.nih.gov/HomoloGene/homolquery.cgi?TEXT=27[loc]&TAXID=9606
==========================
The file does indeed terminate with a >> but I didn't see any empty lines
after that. I'll submit this as a bug report along with everything I've
tested so far.
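
One way to check the end-of-file theory without going through Bio::SeqIO
is to split the file on the ">>" delimiter mentioned in this thread and
list any chunk that has no LOCUSID line. The following is only a minimal
sketch under two assumptions: records are separated by a newline followed
by ">>", and every real record has a line beginning with "LOCUSID:". The
sub name chunks_without_locusid is made up for this example and is not
part of BioPerl, and the splitting here is not taken from locuslink.pm.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function).
# Reads ">>"-delimited chunks from a filehandle and returns a list of
# [chunk_number, reason] pairs for chunks that have no LOCUSID line.
# Assumption: the record separator is a newline followed by ">>".
sub chunks_without_locusid {
    my ($fh) = @_;
    local $/ = "\n>>";    # read one chunk per <$fh> call
    my @suspect;
    my $n = 0;
    while (my $chunk = <$fh>) {
        $n++;
        chomp $chunk;     # removes a trailing "\n>>" if present
        if ($chunk =~ /\A\s*\z/) {
            # nothing but whitespace, e.g. stray empty lines after the last ">>"
            push @suspect, [$n, 'blank'];
        }
        elsif ($chunk !~ /^LOCUSID:/m) {
            push @suspect, [$n, 'no LOCUSID'];
        }
    }
    return @suspect;
}

# Command-line use: perl check_ll.pl LL_tmpl
if (my $file = shift @ARGV) {
    open my $fh, '<', $file or die "Cannot open $file: $!\n";
    printf "chunk %d: %s\n", @$_ for chunks_without_locusid($fh);
    close $fh;
}
```

If this prints nothing, every chunk has a LOCUSID line under the
assumptions above, and the problem is more likely in how the parser
reaches the end of the file than in the data itself.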
Paul
On Mon, 2 Dec 2002, Hilmar Lapp wrote:
> Maybe an end of file (recognition-) problem. Could be pretty simple.
> If you visit the end of your offending input file, are there strange
> things or excessive empty lines? Does it terminate with a record
> delimiter (>>)?
>
> I may not get a chance to investigate this before Wednesday. Can you
> submit it as a bug report to make sure it's in the queue?
>
> -hilmar
>
> On Friday, November 29, 2002, at 02:52 PM, Paul Boutros wrote:
>
> > Hi again,
> >
> > I don't encounter any problems parsing the test file:
> > t\data\LL-sample.seq
> >
> > If I run the LocusLink test
> > c:\perl\bioperl-live> perl -w t\LocusLink.t
> > 1..23
> > ok 1
> > ok 2
> > Use of uninitialized value in pattern match (m//) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 384, <GEN0> line 2.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 600, <GEN0> line 2.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 603, <GEN0> line 2.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 604, <GEN0> line 2.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 605, <GEN0> line 2.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 606, <GEN0> line 2.
> > Use of uninitialized value in length at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 618, <GEN0> line 2.
> > Use of uninitialized value in substr at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 633, <GEN0> line 2.
> > Use of uninitialized value in pattern match (m//) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 384, <GEN0> line 3.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 600, <GEN0> line 3.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 603, <GEN0> line 3.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 604, <GEN0> line 3.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 605, <GEN0> line 3.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 606, <GEN0> line 3.
> > Use of uninitialized value in length at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 618, <GEN0> line 3.
> > Use of uninitialized value in substr at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 633, <GEN0> line 3.
> > ok 3
> >
> > and okay through the rest of the tests.
> >
> > Visually the two files look very similar, and there are no obvious
> > formatting differences. And it does take quite a few seconds of running
> > before the two "Deep Recursion" warnings come up, then a few more
> > before I get the exception.
> >
> > When I run:
> >
> > use Bio::SeqIO;
> > use strict;
> > my $file = $ARGV[0];
> > my $seqio = Bio::SeqIO->new(
> > -format => 'locuslink',
> > -file => $file
> > );
> >
> > while (my $seq = $seqio->next_seq()) {
> > my $acc = $seq->annotation();
> > print $seq->accession(), "\n";
> > }
> >
> > The two deep recursion warnings come as:
> >
> > 15601
> > Deep recursion on subroutine "Bio::SeqIO::locuslink::next_seq" at
> > C:/Perl/site/lib/Bio\SeqIO\locuslink.pm line 456, <GEN0> chunk 15465.
> > 15731
> > Deep recursion on subroutine "Bio::SeqIO::locuslink::next_seq" at
> > C:/Perl/site/lib/Bio\SeqIO\locuslink.pm line 456, <GEN0> chunk 15595.
> > 15874
> > 15890
> >
> > And the exception is thrown as:
> > 24785
> > 24786
> > 24787
> >
> > ------------- EXCEPTION -------------
> > MSG: No LOCUSID in first line of record. Not LocusLink in my book.
> > STACK Bio::SeqIO::locuslink::next_seq
> > C:/Perl/site/lib/Bio\SeqIO\locuslink.pm:435
> > STACK toplevel testLL.pl:11
> >
> > --------------------------------------
> >
> > On Fri, 29 Nov 2002, Hilmar Lapp wrote:
> >
> >> I will check what's happening. There is a test case and a sample
> >> file in the repository; while you're at it, do you see what the
> >> fundamental difference is between the sample and your input
> >> file? (Or does the test fail as well for you?)
> >>
> >> -hilmar
> >>
> >>> -----Original Message-----
> >>> From: Paul Boutros [mailto:pcboutro@engmail.uwaterloo.ca]
> >>> Sent: Friday, November 29, 2002 9:21 AM
> >>> To: Hilmar Lapp
> >>> Cc: bioperl-l@bioperl.org
> >>> Subject: Re: [Bioperl-l] LocusLink IO
> >>>
> >>>
> >>> I tried again on today's (11/29/2002) LL_tmpl file and same error:
> >>>
> >>> C:\paul\dev\LocusLink>perl -w testLL.pl
> >>> Deep recursion on subroutine "Bio::SeqIO::locuslink::next_seq" at
> >>> C:/Perl/site/lib/Bio\SeqIO\locuslink.pm line 456, <GEN0> chunk 15465.
> >>> Deep recursion on subroutine "Bio::SeqIO::locuslink::next_seq" at
> >>> C:/Perl/site/lib/Bio\SeqIO\locuslink.pm line 456, <GEN0> chunk 15595.
> >>>
> >>> ------------- EXCEPTION -------------
> >>> MSG: No LOCUSID in first line of record. Not LocusLink in my book.
> >>> STACK Bio::SeqIO::locuslink::next_seq
> >>> C:/Perl/site/lib/Bio\SeqIO\locuslink.pm:435
> >>> STACK toplevel testLL.pl:8
> >>>
> >>> --------------------------------------
> >>>
> >>>
> >>> On Thu, 28 Nov 2002, Hilmar Lapp wrote:
> >>>
> >>>> The input file needs to be the LL_tmpl or in that format. Does your
> >>>> input file satisfy this? (NCBI releases several files for LL. Many
> >>>> are in tab-format; the LL_tmpl format is a tagged-line format.)
> >>>>
> >>>> -hilmar
> >>>>
> >>>> On Thursday, November 28, 2002, at 03:05 PM, Paul Boutros wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I'm using the LocusLink SeqIO parser with a download of LocusLink
> >>>>> from NCBI today (LL3_021128.txt). When I just parse through the
> >>>>> file doing nothing except for checking the organism annotation with:
> >>>>>
> >>>>> use Bio::SeqIO;
> >>>>>
> >>>>> my $seqio = Bio::SeqIO->new(
> >>>>> -format => 'locuslink',
> >>>>> -file => 'LL3_021128.txt'
> >>>>> );
> >>>>>
> >>>>> while (my $acc = $seqio->next_seq()->annotation()) {
> >>>>> if ($acc->get_Annotations('ORGANISM') =~ /rattus norvegicus/i) {
> >>>>> print "Rat!\n";
> >>>>> }
> >>>>> }
> >>>>>
> >>>>> I get:
> >>>>>
> >>>>> Deep recursion on subroutine "Bio::SeqIO::locuslink::next_seq" at
> >>>>> C:/Perl/site/lib/Bio\SeqIO\locuslink.pm line 456, <GEN0> chunk 15465.
> >>>>> Deep recursion on subroutine "Bio::SeqIO::locuslink::next_seq" at
> >>>>> C:/Perl/site/lib/Bio\SeqIO\locuslink.pm line 456, <GEN0> chunk 15595.
> >>>>>
> >>>>>
> >>>>> ------------- EXCEPTION -------------
> >>>>> MSG: No LOCUSID in first line of record. Not LocusLink in my book.
> >>>>> STACK Bio::SeqIO::locuslink::next_seq
> >>>>> C:/Perl/site/lib/Bio\SeqIO\locuslink.pm:435
> >>>>> STACK toplevel testLL.pl:10
> >>>>> --------------------------------------
> >>>>>
> >>>>> Interestingly enough, I also don't get any output from the ORGANISM
> >>>>> check, so I must be doing that wrong, too. I notice that the script
> >>>>> runs for a fair while before spitting out the two "Deep recursion"
> >>>>> warnings, and then a fair bit longer before hitting the exception.
> >>>>>
> >>>>> Any ideas if I'm doing something unusual, or if maybe I should
> >>>>> submit this as a bug report?
> >>>>>
> >>>>> Paul
> >>>>>
> >>>>> OS: Win XP SP 1 and Win2K SP2
> >>>>> Perl: 5.8.0 and 5.6.1
> >>>>> BioPerl: CVS yesterday
> >>>>>
> >>>>> _______________________________________________
> >>>>> Bioperl-l mailing list
> >>>>> Bioperl-l@bioperl.org
> >>>>> http://bioperl.org/mailman/listinfo/bioperl-l
> >>>>>
> >>>> --
> >>>> -------------------------------------------------------------
> >>>> Hilmar Lapp email: lapp at gnf.org
> >>>> GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
> >>>> -------------------------------------------------------------
> >>>>
> >>>
> >>>
> >>
> >
> >
> >
> --
> -------------------------------------------------------------
> Hilmar Lapp email: lapp at gnf.org
> GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
> -------------------------------------------------------------
>