[Bioperl-l] LocusLink IO

Paul Boutros pcboutro@engmail.uwaterloo.ca
Mon, 2 Dec 2002 13:50:41 -0500 (EST)


I followed the suggestion (I think from Allan Day) of extracting &
diff'ing record 27 from the file.  This is what I got:

===========================
pcboutro@engmail[5] diff testLL.txt LL-sample.seq | more
43a44
> UNIGENE: Hs.75741
46d46
< UNIGENE: Hs.75741
52a53,54
> BUTTON: homol.gif
> LINK: http://www.ncbi.nlm.nih.gov/HomoloGene/homolquery.cgi?TEXT=26[loc]&TAXID=9606
133a136
> UNIGENE: Hs.121521
136d138
< UNIGENE: Hs.121521
144a147,148
> BUTTON: homol.gif
> LINK: http://www.ncbi.nlm.nih.gov/HomoloGene/homolquery.cgi?TEXT=27[loc]&TAXID=9606
==========================

The file does indeed terminate with a >> but I didn't see any empty lines
after that.  I'll submit this as a bug report along with everything I've
tested so far.
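
For anyone who wants to check their own file for the same problem, here is a
minimal sketch of a standalone scanner. It does not use the BioPerl parser at
all. It assumes only what has come up in this thread: that each record starts
with a line beginning with ">>", and that the line right after it should be
the LOCUSID line. The name find_bad_records and the returned hash keys are
made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: scan a filehandle in the tagged-line format and
# return one entry per record that lacks a LOCUSID line right after its
# ">>" delimiter. Assumes every record starts with a line beginning ">>".
sub find_bad_records {
    my ($fh) = @_;
    my @bad;
    my $record    = 0;    # running count of ">>" delimiters seen
    my $expecting = 0;    # true while waiting for the line after ">>"
    my $start     = 0;    # line number of the current delimiter

    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;    # tolerate DOS and Unix line endings
        if ($line =~ /^>>/) {
            # Two delimiters in a row: the earlier record was empty.
            push @bad, { record => $record, line => $start, found => '' }
                if $expecting;
            $record++;
            $start     = $.;
            $expecting = 1;
            next;
        }
        next unless $expecting;
        push @bad, { record => $record, line => $start, found => $line }
            unless $line =~ /^LOCUSID:/;
        $expecting = 0;
    }
    # A delimiter at end of file with nothing after it.
    push @bad, { record => $record, line => $start, found => '' }
        if $expecting;
    return @bad;
}

# Usage: perl scan_ll.pl LL_tmpl
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    printf "record %d (line %d): expected LOCUSID, found '%s'\n",
        @{$_}{qw(record line found)} for find_bad_records($fh);
    close $fh;
}
```

If the guess about an end-of-file problem is right, a trailing ">>" with
nothing after it should show up as the last entry reported.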

Paul

On Mon, 2 Dec 2002, Hilmar Lapp wrote:

> Maybe an end of file (recognition-) problem. Could be pretty simple. 
> If you visit the end of your offending input file, are there strange 
> things or excessive empty lines? Does it terminate with a record 
> delimiter (>>)?
> 
> I may not get a chance to investigate this before Wednesday. Can you 
> submit it as a bug report to make sure it's in the queue?
> 
> 	-hilmar
> 
> On Friday, November 29, 2002, at 02:52 PM, Paul Boutros wrote:
> 
> > Hi again,
> >
> > I don't encounter any problems parsing the test file:
> > t\data\LL-sample.seq
> >
> > If I run the LocusLink test
> > c:\perl\bioperl-live> perl -w t\LocusLink.t
> > 1..23
> > ok 1
> > ok 2
> > Use of uninitialized value in pattern match (m//) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 384, <GEN0> line 2.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 600, <GEN0> line 2.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 603, <GEN0> line 2.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 604, <GEN0> line 2.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 605, <GEN0> line 2.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 606, <GEN0> line 2.
> > Use of uninitialized value in length at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 618, <GEN0> line 2.
> > Use of uninitialized value in substr at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 633, <GEN0> line 2.
> > Use of uninitialized value in pattern match (m//) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 384, <GEN0> line 3.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 600, <GEN0> line 3.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 603, <GEN0> line 3.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 604, <GEN0> line 3.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 605, <GEN0> line 3.
> > Use of uninitialized value in transliteration (tr///) at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 606, <GEN0> line 3.
> > Use of uninitialized value in length at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 618, <GEN0> line 3.
> > Use of uninitialized value in substr at
> > C:/Perl/site/lib/Bio\SeqIO\embl.pm line 633, <GEN0> line 3.
> > ok 3
> >
> > and okay through the rest of the tests.
> >
> > Visually the two files look very similar, and there are no obvious
> > formatting differences.  And it does take quite a few seconds of running
> > before the two "Deep Recursion" warnings come up, then a few more before I
> > get the exception.
> >
> > When I run:
> >
> > use Bio::SeqIO;
> > use strict;
> > my $file = $ARGV[0];
> > my $seqio = Bio::SeqIO->new(
> > 			-format	=> 'locuslink',
> > 			-file	=> $file
> > 			);
> >
> > while (my $seq = $seqio->next_seq()) {
> > 	my $acc = $seq->annotation();
> > 	print $seq->accession(), "\n";
> > 	}
> >
> > The two deep recursion warnings come as:
> >
> > 15601
> > Deep recursion on subroutine "Bio::SeqIO::locuslink::next_seq" at
> > C:/Perl/site/lib/Bio\SeqIO\locuslink.pm line 456, <GEN0> chunk 15465.
> > 15731
> > Deep recursion on subroutine "Bio::SeqIO::locuslink::next_seq" at
> > C:/Perl/site/lib/Bio\SeqIO\locuslink.pm line 456, <GEN0> chunk 15595.
> > 15874
> > 15890
> >
> > And the exception is thrown as:
> > 24785
> > 24786
> > 24787
> >
> > ------------- EXCEPTION  -------------
> > MSG: No LOCUSID in first line of record. Not LocusLink in my book.
> > STACK Bio::SeqIO::locuslink::next_seq
> > C:/Perl/site/lib/Bio\SeqIO\locuslink.pm:435
> > STACK toplevel testLL.pl:11
> >
> > --------------------------------------
> >
> > On Fri, 29 Nov 2002, Hilmar Lapp wrote:
> >
> >> I will check what's happening. There is a test case and a sample 
> >> file in the repository; while you're at it, do you see what the 
> >> fundamental difference is between the sample and your input 
> >> file? (Or does the test fail as well for you?)
> >>
> >> 	-hilmar
> >>
> >>> -----Original Message-----
> >>> From: Paul Boutros [mailto:pcboutro@engmail.uwaterloo.ca]
> >>> Sent: Friday, November 29, 2002 9:21 AM
> >>> To: Hilmar Lapp
> >>> Cc: bioperl-l@bioperl.org
> >>> Subject: Re: [Bioperl-l] LocusLink IO
> >>>
> >>>
> >>> I tried again on today's (11/29/2002) LL_tmpl file and same error:
> >>>
> >>> C:\paul\dev\LocusLink>perl -w testLL.pl
> >>> Deep recursion on subroutine "Bio::SeqIO::locuslink::next_seq" at
> >>> C:/Perl/site/lib/Bio\SeqIO\locuslink.pm line 456, <GEN0> chunk 15465.
> >>> Deep recursion on subroutine "Bio::SeqIO::locuslink::next_seq" at
> >>> C:/Perl/site/lib/Bio\SeqIO\locuslink.pm line 456, <GEN0> chunk 15595.
> >>>
> >>> ------------- EXCEPTION  -------------
> >>> MSG: No LOCUSID in first line of record. Not LocusLink in my book.
> >>> STACK Bio::SeqIO::locuslink::next_seq
> >>> C:/Perl/site/lib/Bio\SeqIO\locuslink.pm:435
> >>> STACK toplevel testLL.pl:8
> >>>
> >>> --------------------------------------
> >>>
> >>>
> >>> On Thu, 28 Nov 2002, Hilmar Lapp wrote:
> >>>
> >>>> The input file needs to be the LL_tmpl or in that format. Does your
> >>>> input file satisfy this? (NCBI releases several files for LL. Many
> >>>> are in tab-format; the LL_tmpl format is a tagged-line format.)
> >>>>
> >>>> 	-hilmar
> >>>>
> >>>> On Thursday, November 28, 2002, at 03:05 PM, Paul Boutros wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I'm using the LocusLink SeqIO parser with a download of LocusLink from
> >>>>> NCBI today (LL3_021128.txt).  When I just parse through the file doing
> >>>>> nothing except for checking the organism annotation with:
> >>>>>
> >>>>> use Bio::SeqIO;
> >>>>>
> >>>>> my $seqio = Bio::SeqIO->new(
> >>>>> 			-format	=> 'locuslink',
> >>>>> 			-file	=> 'LL3_021128.txt'
> >>>>> 			);
> >>>>>
> >>>>> while (my $acc = $seqio->next_seq()->annotation()) {
> >>>>> 	if ($acc->get_Annotations('ORGANISM') =~ /rattus norvegicus/i) {
> >>>>> 		print "Rat!\n";
> >>>>> 		}
> >>>>> 	}
> >>>>>
> >>>>> I get:
> >>>>>
> >>>>> Deep recursion on subroutine "Bio::SeqIO::locuslink::next_seq" at
> >>>>> C:/Perl/site/lib/Bio\SeqIO\locuslink.pm line 456, <GEN0> chunk 15465.
> >>>>> Deep recursion on subroutine "Bio::SeqIO::locuslink::next_seq" at
> >>>>> C:/Perl/site/lib/Bio\SeqIO\locuslink.pm line 456, <GEN0> chunk 15595.
> >>>>>
> >>>>>
> >>>>> ------------- EXCEPTION  -------------
> >>>>> MSG: No LOCUSID in first line of record. Not LocusLink in my book.
> >>>>> STACK Bio::SeqIO::locuslink::next_seq
> >>>>> C:/Perl/site/lib/Bio\SeqIO\locuslink.pm:435
> >>>>> STACK toplevel testLL.pl:10
> >>>>> --------------------------------------
> >>>>>
> >>>>> Interestingly enough I also don't get any output from the ORGANISM
> >>>>> check, so I must be doing that wrong, too.  I notice that the script
> >>>>> runs for a fair while before spitting out the two "Deep recursion"
> >>>>> warnings, and then a fair bit longer before hitting the exception.
> >>>>>
> >>>>> Any ideas if I'm doing something unusual, or if maybe I should submit
> >>>>> this as a bug report?
> >>>>>
> >>>>> Paul
> >>>>>
> >>>>> OS: Win XP SP 1 and Win2K SP2
> >>>>> Perl: 5.8.0 and 5.6.1
> >>>>> BioPerl: CVS yesterday
> >>>>>
> >>>> --
> >>>> -------------------------------------------------------------
> >>>> Hilmar Lapp                            email: lapp at gnf.org
> >>>> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> >>>> -------------------------------------------------------------
> --
> -------------------------------------------------------------
> Hilmar Lapp                            email: lapp at gnf.org
> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> -------------------------------------------------------------
>