[Bioperl-l] problem with swissprot parsin
Siddhartha Basu
basu at pharm.sunysb.edu
Thu Oct 14 16:15:10 EDT 2004
Hi Brian,
Here is the code that gives the following error. I presume I am using
Bio::DB::Flat::BDB, though I haven't called it directly. I am
trying to index swissprot/trembl files here.
#!/usr/bin/perl -w
use strict;
use Bio::DB::Flat;
die "no files\n" unless @ARGV;
my @files = @ARGV;    # files to index, taken from the command line
my $LOCATION = "/home/basu/odbaindex";
my $db = Bio::DB::Flat->new( -directory  => $LOCATION,
                             -dbname     => "swissall",
                             -format     => "swiss",
                             -index      => "bdb",
                             -write_flag => 1,
                           ) or die "can't create BioFlat indexes\n";
$db->build_index(@files);
print "Done indexing\n";
exit;
I get the following warnings.
======================================================================
Use of uninitialized value in substitution (s///) at
/usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN1> line
18676877.
Use of uninitialized value in substitution (s///) at
/usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN1> line
18676916.
Use of uninitialized value in substitution (s///) at
/usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN1> line
18676956.
Use of uninitialized value in substitution (s///) at
/usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN1> line
18677002.
Use of uninitialized value in substitution (s///) at
/usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN1> line
=========================================================================
I have done a small test with the Bio::SeqIO module using a small test
file (swiss.test). Here is the code.
#!/usr/bin/perl -w
#
use strict;
use Bio::SeqIO;
my $seq = Bio::SeqIO->new(-file => $ARGV[0], -format => "swiss");
while (my $in = $seq->next_seq) {
    print $in->id, "\n";
}
exit;
It gives the same warnings:
Use of uninitialized value in substitution (s///) at
/usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN0> line 28.
1433_CAEEL
Use of uninitialized value in substitution (s///) at
/usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN0> line 87.
A4_CAEEL
Use of uninitialized value in substitution (s///) at
/usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN0> line 171.
AATC_CAEEL
I have also attached the test file.
I hope this gives some clue to the problem.
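One way to narrow this down without BioPerl is to map the line numbers from the "<GEN0> line N" warnings back onto the test file. The following is a minimal sketch in plain Perl; the sub name locate_lines and the script name locate.pl are made up for illustration and are not part of BioPerl. It assumes that the reported number is the last line read from the input when the warning fired, which is how Perl reports "<HANDLE> line N".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): given an open filehandle on a
# Swiss-Prot flat file and a list of input line numbers, return for each
# number the entry ID it falls in, the line itself and the line before it.
sub locate_lines {
    my ($fh, @wanted) = @_;
    my %want = map { $_ => 1 } @wanted;
    my ($id, $prev) = ('', '');
    my @hits;
    while (my $line = <$fh>) {
        chomp $line;
        $id = $1 if $line =~ /^ID\s+(\S+)/;    # an ID line starts a new entry
        if ($want{$.}) {                       # $. is the current input line number
            push @hits, { line => $., id => $id, text => $line, prev => $prev };
        }
        $id = '' if $line eq '//';             # '//' terminates the entry
        $prev = $line;
    }
    return @hits;
}

# Command-line use: perl locate.pl swiss.test 28 87 171
if (@ARGV) {
    my ($file, @nums) = @ARGV;
    open my $fh, '<', $file or die "cannot open $file: $!\n";
    for my $hit (locate_lines($fh, @nums)) {
        print "line $hit->{line} in entry $hit->{id}\n";
        print "  previous: $hit->{prev}\n";
        print "  current:  $hit->{text}\n";
    }
    close $fh;
}
```

Counting by hand in the copy of swiss.test attached below, lines 28, 87 and 171 each come directly after the RL line of the "C. ELEGANS SEQUENCING CONSORTIUM" reference, which has an RG line but no RA line. That may point at the reference parsing, but it is only a guess from the line numbers and not a confirmed diagnosis.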
Thanks for the response.
siddhartha
Brian Osborne wrote:
> Siddhartha,
>
> Bio::DB::Flat::BinarySearch or Bio::DB::Flat::BDB? Also, please show your
> code when you ask a question, it simplifies matters. For example, it would
> tell me which module you used, which file format, and so on. It also helps
> to attach the actual sequence files, or some smaller test file that shows
> the same error. What happens occasionally is that a question will get
> ignored for the simple reason that no one knows how to answer, there's not
> enough information given in the letter.
>
> Brian O.
>
> -----Original Message-----
> From: bioperl-l-bounces at portal.open-bio.org
> [mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Siddhartha Basu
> Sent: Thursday, October 14, 2004 2:51 PM
> To: bioperl-l at bioperl.org
> Subject: [Bioperl-l] problem with swissprot parsin
>
> Hi,
> I have already described this problem on this mailing list but haven't
> got anybody's attention yet. I also asked the author of this module
> but have heard nothing back yet. Anyway, I really couldn't figure out how
> to solve this, so I am writing again. I have also tried replacing the
> swiss.pm module with the one from bioperl-live, but the problem persists.
> I understand that this is a maintained module and that I am not being
> ignored because of a maintenance issue.
>
> I am trying to make a flat file index of swissprot/trembl files using
> Bio::DB::Flat module. However, i am getting the following consistent
> warnings during the indexing process.
> ======================================================================
> Use of uninitialized value in substitution (s///) at
> /usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN1> line
> 18676877.
> Use of uninitialized value in substitution (s///) at
> /usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN1> line
> 18676916.
> Use of uninitialized value in substitution (s///) at
> /usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN1> line
> 18676956.
> Use of uninitialized value in substitution (s///) at
> /usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN1> line
> 18677002.
> Use of uninitialized value in substitution (s///) at
> /usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN1> line
> 18677045.
> Use of uninitialized value in substitution (s///) at
> /usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN1> line
> 18677091.
> Use of uninitialized value in substitution (s///) at
> /usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN1> line
> 18677136.
> Use of uninitialized value in substitution (s///) at
> /usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN1> line
> 18677178.
> Use of uninitialized value in substitution (s///) at
> /usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN1> line
> 18677209.
> Use of uninitialized value in substitution (s///) at
> /usr/lib/perl5/site_perl/5.8.3/Bio/SeqIO/swiss.pm line 855, <GEN1> line
> 18677249.
> ========================================================================
> Though the indexing completes, I couldn't fetch any data from the index,
> as it does not return any seq obj.
> I also get the same warnings when I try to read the swissprot file using
> the Bio::SeqIO module.
> I am using bioperl-1.4 and understand it has something to do with the
> swissprot parser in the Bio::SeqIO module.
> So, is any fix or solution available for this problem?
>
> Thanks in advance.
>
> -siddhartha
-------------- next part --------------
ID 1433_CAEEL STANDARD; PRT; 248 AA.
AC P41932; Q21537;
DT 01-NOV-1995 (Rel. 32, Created)
DT 01-NOV-1995 (Rel. 32, Last sequence update)
DT 01-OCT-2004 (Rel. 45, Last annotation update)
DE 14-3-3-like protein 1.
GN Name=ftt-1; ORFNames=M117.2;
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=Bristol N2;
RX MEDLINE=95011616; PubMed=7926802; DOI=10.1016/0378-1119(94)90068-X;
RA Wang W., Shakes D.C.;
RT "Isolation and sequence analysis of a Caenorhabditis elegans cDNA
RT which encodes a 14-3-3 homologue.";
RL Gene 147:215-218(1994).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=Bristol N2;
RX MEDLINE=99069613; PubMed=9851916;
RG THE C. ELEGANS SEQUENCING CONSORTIUM;
RT "Genome sequence of the nematode C. elegans: a platform for
RT investigating biology.";
RL Science 282:2012-2018(1998).
CC -!- SIMILARITY: Belongs to the 14-3-3 family.
CC --------------------------------------------------------------------------
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license at isb-sib.ch).
CC --------------------------------------------------------------------------
DR EMBL; U05038; AAA61872.1; -.
DR EMBL; Z73910; CAA98138.1; -.
DR PIR; JC2581; JC2581.
DR PIR; T23759; T23759.
DR HSSP; P93343; 1O9E.
DR IntAct; P41932; -.
DR WormPep; M117.2; CE06200.
DR InterPro; IPR000308; 14-3-3.
DR Pfam; PF00244; 14-3-3; 1.
DR PRINTS; PR00305; 1433ZETA.
DR SMART; SM00101; 14_3_3; 1.
DR PROSITE; PS00796; 1433_1; 1.
DR PROSITE; PS00797; 1433_2; 1.
KW Multigene family.
FT CONFLICT 118 118 A -> V (in Ref. 2).
SQ SEQUENCE 248 AA; 28162 MW; B9350039628341AF CRC64;
MSDTVEELVQ RAKLAEQAER YDDMAAAMKK VTEQGQELSN EERNLLSVAY KNVVGARRSS
WRVISSIEQK TEGSEKKQQL AKEYRVKVEQ ELNDICQDVL KLLDEFLIVK AGAAESKAFY
LKMKGDYYRY LAEVASEDRA AVVEKSQKAY QEALDIAKDK MQPTHPIRLG LALNFSVFYY
EILNTPEHAC QLAKQAFDDA IAELDTLNED SYKDSTLIMQ LLRDNLTLWT SDVGAEDQEQ
EGNQEAGN
//
ID A4_CAEEL STANDARD; PRT; 686 AA.
AC Q10651; Q18583; Q95ZX1;
DT 28-FEB-2003 (Rel. 41, Created)
DT 28-FEB-2003 (Rel. 41, Last sequence update)
DT 01-OCT-2004 (Rel. 45, Last annotation update)
DE Beta-amyloid-like protein precursor.
GN Name=apl-1; ORFNames=C42D8.8;
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE OF 6-686 FROM N.A.
RC STRAIN=Bristol N2;
RX MEDLINE=94089766; PubMed=8265668;
RA Daigle I., Li C.;
RT "apl-1, a Caenorhabditis elegans gene encoding a protein related to
RT the human beta-amyloid protein precursor.";
RL Proc. Natl. Acad. Sci. U.S.A. 90:12045-12049(1993).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=Bristol N2;
RX MEDLINE=99069613; PubMed=9851916;
RG THE C. ELEGANS SEQUENCING CONSORTIUM;
RT "Genome sequence of the nematode C. elegans: a platform for
RT investigating biology.";
RL Science 282:2012-2018(1998).
RN [3]
RP REVISIONS, AND ALTERNATIVE SPLICING.
RA Waterston R.;
RL Submitted (JUN-2001) to the EMBL/GenBank/DDBJ databases.
CC -!- SUBCELLULAR LOCATION: Type I membrane protein (Potential).
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=2;
CC Name=a;
CC IsoId=Q10651-1; Sequence=Displayed;
CC Name=b;
CC IsoId=Q10651-2; Sequence=VSP_000017;
CC Note=No experimental confirmation available;
CC -!- SIMILARITY: Belongs to the APP family.
CC --------------------------------------------------------------------------
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license at isb-sib.ch).
CC --------------------------------------------------------------------------
DR EMBL; U00240; AAC46470.1; ALT_INIT.
DR EMBL; U56966; AAA98722.1; -.
DR EMBL; U56966; AAK68242.1; -.
DR PIR; T15795; T15795.
DR HSSP; P05067; 1MWP.
DR WormPep; C42D8.8a; CE04209.
DR WormPep; C42D8.8b; CE27845.
DR InterPro; IPR008155; A4_APP.
DR InterPro; IPR008154; A4_extra.
DR Pfam; PF02177; A4_EXTRA; 1.
DR PRINTS; PR00203; AMYLOIDA4.
DR SMART; SM00006; A4_EXTRA; 1.
DR PROSITE; PS00319; A4_EXTRA; 1.
KW Alternative splicing; Amyloid; Glycoprotein; Neurogenesis; Signal;
KW Transmembrane.
FT SIGNAL 1 21 Potential.
FT CHAIN 22 686 Beta-amyloid-like protein.
FT DOMAIN 22 621 Extracellular (Potential).
FT TRANSMEM 622 642 Potential.
FT DOMAIN 643 686 Cytoplasmic (Potential).
FT DOMAIN 205 228 Asp-rich.
FT DOMAIN 676 679 Clathrin-binding (Potential).
FT CARBOHYD 84 84 N-linked (GlcNAc...) (Potential).
FT CARBOHYD 201 201 N-linked (GlcNAc...) (Potential).
FT CARBOHYD 249 249 N-linked (GlcNAc...) (Potential).
FT CARBOHYD 417 417 N-linked (GlcNAc...) (Potential).
FT VARSPLIC 538 539 Missing (in isoform b).
FT /FTId=VSP_000017.
SQ SEQUENCE 686 AA; 79434 MW; A0816858FDD48608 CRC64;
MTVGKLMIGL LIPILVATVY AEGSPAGSKR HEKFIPMVAF SCGYRNQYMT EEGSWKTDDE
RYATCFSGKL DILKYCRKAY PSMNITNIVE YSHEVSISDW CREEGSPCKW THSVRPYHCI
DGEFHSEALQ VPHDCQFSHV NSRDQCNDYQ HWKDEAGKQC KTKKSKGNKD MIVRSFAVLE
PCALDMFTGV EFVCCPNDQT NKTDVQKTKE DEDDDDDEDD AYEDDYSEES DEKDEEEPSS
QDPYFKIANW TNEHDDFKKA EMRMDEKHRK KVDKVMKEWG DLETRYNEQK AKDPKGAEKF
KSQMNARFQK TVSSLEEEHK RMRKEIEAVH EERVQAMLNE KKRDATHDYR QALATHVNKP
NKHSVLQSLK AYIRAEEKDR MHTLNRYRHL LKADSKEAAA YKPTVIHRLR YIDLRINGTL
AMLRDFPDLE KYVRPIAVTY WKDYRDEVSP DISVEDSELT PIIHDDEFSK NAKLDVKAPT
TTAKPVKETD NAKVLPTEAS DSEEEADEYY EDEDDEQVKK TPDMKKKVKV VDIKPKEIKV
TIEEEKKAPK LVETSVQTDD EDDDEDSSSS TSSESDEDED KNIKELRVDI EPIIDEPASF
YRHDKLIQSP EVERSASSVF QPYVLASAMF ITAICIIAFA ITNARRRRAM RGFIEVDVYT
PEERHVAGMQ VNGYENPTYS FFDSKA
//
ID AATC_CAEEL STANDARD; PRT; 408 AA.
AC Q22067;
DT 01-NOV-1997 (Rel. 35, Created)
DT 01-NOV-1997 (Rel. 35, Last sequence update)
DT 01-OCT-2004 (Rel. 45, Last annotation update)
DE Probable aspartate aminotransferase, cytoplasmic (EC 2.6.1.1)
DE (Transaminase A) (Glutamate oxaloacetate transaminase-1).
GN ORFNames=T01C8.5;
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=Bristol N2;
RX MEDLINE=99069613; PubMed=9851916;
RG THE C. ELEGANS SEQUENCING CONSORTIUM;
RT "Genome sequence of the nematode C. elegans: a platform for
RT investigating biology.";
RL Science 282:2012-2018(1998).
CC -!- CATALYTIC ACTIVITY: L-aspartate + 2-oxoglutarate = oxaloacetate +
CC L-glutamate.
CC -!- COFACTOR: Pyridoxal phosphate (By similarity).
CC -!- SUBUNIT: Homodimer (By similarity).
CC -!- SUBCELLULAR LOCATION: Cytoplasmic (Potential).
CC -!- MISCELLANEOUS: In eukaryotes there are cytoplasmic, mitochondrial
CC and chloroplastic isozymes.
CC -!- SIMILARITY: Belongs to the class-I pyridoxal-phosphate-dependent
CC aminotransferase family.
CC --------------------------------------------------------------------------
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license at isb-sib.ch).
CC --------------------------------------------------------------------------
DR EMBL; U58726; AAB00578.1; -.
DR PIR; T29857; T29857.
DR HSSP; P00503; 1AJS.
DR WormPep; T01C8.5; CE07462.
DR InterPro; IPR004839; Aminotrans_I/II.
DR InterPro; IPR000796; Asp_trans.
DR InterPro; IPR004838; NHtransf_1_BS.
DR Pfam; PF00155; Aminotran_1_2; 1.
DR PRINTS; PR00799; TRANSAMINASE.
DR PROSITE; PS00105; AA_TRANSFER_CLASS_1; 1.
KW Aminotransferase; Pyridoxal phosphate; Transferase.
FT BINDING 251 251 Pyridoxal phosphate (By similarity).
SQ SEQUENCE 408 AA; 45493 MW; A4DDCBCB8C0EFD83 CRC64;
MSFFDGIPVA PPIEVFHKNK MYLDETAPVK VNLTIGAYRT EEGQPWVLPV VHETEVEIAN
DTSLNHEYLP VLGHEGFRKA ATELVLGAES PAIKEERSFG VQCLSGTGAL RAGAEFLASV
CNMKTVYVSN PTWGNHKLVF KKAGFTTVAD YTFWDYDNKR VHIEKFLSDL ESAPEKSVII
LHGCAHNPTG MDPTQEQWKL VAEVIKRKNL FTFFDIAYQG FASGDPAADA WAIRYFVDQG
MEMVVSQSFA KNFGLYNERV GNLTVVVNNP AVIAGFQSQM SLVIRANWSN PPAHGARIVH
KVLTTPARRE QWNQSIQAMS SRIKQMRAAL LRHLMDLGTP GTWDHIIQQI GMFSYTGLTS
AQVDHLIANH KVFLLRDGRI NICGLNTKNV EYVAKAIDET VRAVKSNI
//