[Bioperl-l] Warning message terminates Stream_by_query?
Jason Stajich
jason at cgt.duhs.duke.edu
Wed Jun 2 17:44:37 EDT 2004
That is annoying, isn't it...
I am worried that you aren't able to download the entire sequence for
some reason and it is truncated.
The message shows it stopping in the middle of the translation:
translation="MTLSKETEVIFDWRRGVEYHSANPPLYDSSTFHQTSLG
GDVKYDYARSGNPNRELLEEKLARLEQGKFAFAFASGIAAISAVLLTFK
SGDHVILPDDVYGGTFRLTEQILNRFNIEFTTVDTTKLEQIEGAIQSNTK
LIYIETPSNPCFKITDIKAVSKIAEKHELLVAVDNTFMTPLGQSPLLLGAD
IVIHSATKFLSGHSDLIN
whereas the translation has a few more lines in the original chromosome file.
I wonder if there is a problem downloading a whole chromosome record from
GenBank - the web download is not the most reliable method and you'll find
life easier if you can download the .gbk files directly.
Whether that works depends on what you are working on, I guess, and on
whether you can predict the set of accessions. If you are just working on
finished/published genomes, you can grab files such as this S. aureus record
from ftp://ftp.ncbi.nih.gov/genbank/genomes, and I bet you won't have the
same problem.
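
One quick way to test the truncation theory before handing a downloaded file
to a parser is to check that the record ends with the "//" terminator line
that closes every GenBank flat-file record. The sketch below is plain Python
rather than BioPerl and is not part of any library; the function name
looks_truncated and the two sample strings are made up for illustration, and
the quote count is only a rough heuristic, not a real parse.

```python
def looks_truncated(record_text: str) -> bool:
    """Heuristic: return True if a GenBank flat-file record appears cut off."""
    # Ignore blank lines and trailing whitespace.
    lines = [ln.rstrip() for ln in record_text.splitlines() if ln.strip()]
    if not lines:
        return True
    # A complete GenBank record ends with a line containing only "//".
    if lines[-1] != "//":
        return True
    # Rough check (an assumption, not a full parse): quoted qualifier values
    # should pair up, so an odd number of double quotes suggests a cut-off value.
    return record_text.count('"') % 2 == 1


# Hypothetical sample records, shortened for illustration.
complete = 'LOCUS       TEST\n     /product="putative cystathionine beta-lyase"\n//\n'
cut_off = 'LOCUS       TEST\n     /translation="MTLSKETEVIFDW\n'

print(looks_truncated(complete), looks_truncated(cut_off))
```

If a record fails this check, fetching it again (or taking the .gbk file
from the FTP site) is probably a better bet than trying to parse it.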
-jason
On Wed, 2 Jun 2004, JAMES IBEN wrote:
> Hello list,
>
> I have written a program (my first) which takes a Genbank
> query and retrieves sequences to pull out an intergenic region
> that I would like to work with. However, when running the
> program I always at some point run into the following warning
> message:
>
> -------------------- WARNING ---------------------
> MSG: Unbalanced quote in:
> /locus_tag="SAV0358"
> /codon_start=1
> /transl_table=11
> /product="putative cystathionine beta-lyase"
> /protein_id="BAB56520.1"
> /db_xref="GI:14246126"
> /translation="MTLSKETEVIFDWRRGVEYHSANPPLYDSSTFHQTSLG
> GDVKYDYARSGNPNRELLEEKLARLEQGKFAFAFASGIAAISAVLLTFK
> SGDHVILPDDVYGGTFRLTEQILNRFNIEFTTVDTTKLEQIEGAIQSNTK
> LIYIETPSNPCFKITDIKAVSKIAEKHELLVAVDNTFMTPLGQSPLLLGAD
> IVIHSATKFLSGHSDLINo further qualifiers will be added for this
> feature
> ---------------------------------------------------
>
> With different queries, the message refers to some other
> Genbank sequence (i.e. not always this particular entry). The
> problem is that once I have run into this message, the
> sequence stream terminates, ending the program.
> I have checked these entries and see nothing apparently
> wrong with them (everything is bounded by quotes). Can
> anyone tell me what this error arises from and perhaps what I
> can do to avoid it (or at least to skip any problematic
> sequences without interrupting the stream)?
> The queries I have been submitting should only pull about 250
> sequences if they were not interrupted. Is there some sort of
> stream size limitation that I am hitting? If there is a problem
> with this approach is there a better solution for my particular
> task than using Stream_by_query?
>
> Thanks for your help,
> James
--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu