[Bioperl-l] Re: Error parsing TIGR xml
Josh Lauricha
laurichj at bioinfo.ucr.edu
Tue Jun 22 13:35:54 EDT 2004
The TIGR parser in bioperl 1.4 doesn't parse the coordset files; it
parses the full-fledged TIGR xml releases. Jason wrote an unpublished
parser for the coordsets, which, while similar, are different enough
to really need a different parser.
One thing: IIRC, the coordsets do not contain any sequence data, so
you'll also need to look up the actual sequence.
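In the meantime, if all you need are the feature names and coordinates,
something along these lines might get you going. It is only a rough
sketch: it is not Jason's parser, it uses regexes rather than a real XML
parser, and it assumes the attribute layout shown in the file you quoted
(FEAT_NAME, COM_NAME and COORDS = "start-end"). The sub names
parse_coords and extract_features are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a COORDS value such as "167-586" into (start, end).
# Returns an empty list if the value is missing or malformed.
sub parse_coords {
    my ($coords) = @_;
    return unless defined $coords;
    $coords =~ s/\s+//g;    # tolerate whitespace or line wraps in the value
    return $coords =~ /^(\d+)-(\d+)$/ ? ($1, $2) : ();
}

# Collect name and coordinates for every opening tag named $tag.
# ASSUMPTION: attributes look like NAME = "value", as in the quoted file.
sub extract_features {
    my ($xml, $tag) = @_;
    my @features;
    while ($xml =~ /<\Q$tag\E\b([^>]*)>/g) {
        my $attrs = $1;
        my %attr;
        while ($attrs =~ /(\w+)\s*=\s*"([^"]*)"/g) {
            $attr{$1} = $2;
        }
        my ($start, $end) = parse_coords($attr{COORDS});
        next unless defined $start;    # skip tags without usable COORDS
        push @features, {
            name     => defined $attr{FEAT_NAME} ? $attr{FEAT_NAME} : '',
            com_name => defined $attr{COM_NAME}  ? $attr{COM_NAME}  : '',
            start    => $start,
            end      => $end,
        };
    }
    return @features;
}

# When given a file name, print one tab-separated line per TU.
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    local $/;    # slurp the whole file
    my $xml = <$fh>;
    close $fh;
    for my $tu (extract_features($xml, 'TU')) {
        print join("\t", @{$tu}{qw(name start end com_name)}), "\n";
    }
}
```

Run it with the coordset file as the only argument. Since the coordsets
carry no sequence data, you would still have to fetch the assembly
sequence separately and cut out start..end yourself. For anything beyond
a quick look, a proper XML parser would be the safer choice.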
On Jun 22, 2004, at 10:21 AM, Fernan Aguero wrote:
> Hi!
>
> I'm seeing an error while trying to parse a .coordset file
> from TIGR. It is my first attempt at using this kind of
> file, so perhaps I'm doing something wrong.
>
> Here's my brief script:
>
> #!/usr/bin/perl -w
>
> use strict;
> use Bio::SeqIO;
>
> my $seqio = Bio::SeqIO->new( -file => $ARGV[0], -format => 'tigr');
>
> Just trying to create a SeqIO object produces the following error:
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: [2]Required <ASMBL_ID> missing
> STACK: Error::throw
> STACK: Bio::Root::Root::throw
> /usr/local/lib/perl5/site_perl/5.6.1/Bio/Root/Root.pm:328
> STACK: Bio::SeqIO::tigr::throw
> /usr/local/lib/perl5/site_perl/5.6.1/Bio/SeqIO/tigr.pm:1338
> STACK: Bio::SeqIO::tigr::_process_assembly
> /usr/local/lib/perl5/site_perl/5.6.1/Bio/SeqIO/tigr.pm:522
> STACK: Bio::SeqIO::tigr::_process
> /usr/local/lib/perl5/site_perl/5.6.1/Bio/SeqIO/tigr.pm:423
> STACK: Bio::SeqIO::tigr::_initialize
> /usr/local/lib/perl5/site_perl/5.6.1/Bio/SeqIO/tigr.pm:90
> STACK: Bio::SeqIO::new
> /usr/local/lib/perl5/site_perl/5.6.1/Bio/SeqIO.pm:358
> STACK: Bio::SeqIO::new
> /usr/local/lib/perl5/site_perl/5.6.1/Bio/SeqIO.pm:378
> STACK: ./tigrxml2features.pl:6
> -----------------------------------------------------------
>
>
> The file does contain ASMBL_IDs, or at least that is what I
> believe. These are the first lines of the file
>
> <ASSEMBLY ASMBL_ID = "56" COORDS = "1-2149">
> <HEADER>
> <CLONE_NAME>1047053397923</CLONE_NAME>
> <ORGANISM>Trypanosoma cruzi</ORGANISM>
> <AUTHOR_LIST CONTACT = "">
> </AUTHOR_LIST>
> </HEADER>
> <TU FEAT_NAME = "56.t00001" LOCUS = "Tc00.1047053397923.10" PUB_LOCUS = ""
> ALT_LOCUS = "" COM_NAME = "hypothetical protein" PUB_COMMENT = ""
> COORDS = "167-586">
> <MODEL FEAT_NAME = "56.m00001" COMMENT = "" COORDS = "167-586">
>
> <PROTEIN_SEQ>MKQSSTDGGGKQKGKDSVSSDSMKDAVTDNPGKPTATTIPTSR
> SGDAQEKEGKDDGTDERPTSKKHNSSPETGNTNDALTASENTPQTAETTATTVAKKNDTTIGDSDGSTAVS
> DTASPLLLL
> FLVVVACAAAAAVVAA*</PROTEIN_SEQ>
> <EXON FEAT_NAME = "56.e00001" COORDS = "167-586">
> <CDS FEAT_NAME = "56.c00001" COORDS = "167-586"/>
> </EXON>
> </MODEL>
> </TU>
> </ASSEMBLY>
>
> I've found a mention of a tigrxml module by Jason Stajich that
> was supposed to be different from the SeqIO::tigr by Josh
> Lauricha. But I don't seem to have it on my system
> (bioperl-1.4)
> <http://bioperl.org/pipermail/bioperl-l/2004-January/014491.html>
>
> Thanks in advance,
>
> Fernan
>
> PS: I'm CCing the author of the tigr.pm module, just in
> case.
>
> --
> F e r n a n A g u e r o
> http://genoma.unsam.edu.ar/~fernan
>
Josh Lauricha
laurichj at bioinfo.ucr.edu
OpenPGP: 5A0D 92D3 D093 79DE F724 1137 6DF1 B5EB D9CE AAA8