[Bioperl-l] Bio::Seq::RichSeq error
Heikki Lehvaslaiho
heikki at nildram.co.uk
Thu Dec 11 05:45:33 EST 2003
Fang,
It is not a bug but a feature. In EMBL, GenBank and Swiss-Prot parsers you'll
find these lines:
# Don't make a species object if it's empty or "Unknown" or "None"
return unless $genus and $genus !~ /^(Unknown|None)$/oi;
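To see which organism names trip that guard, here is a minimal sketch in plain Perl. The helper name would_build_species is hypothetical, and taking the first word of the organism line as the genus is an assumption based on the OS-line remark in this message, not a copy of the parser code.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the guard quoted above: true only when the
# parser would go on to build a species object for this genus string.
sub would_build_species {
    my ($genus) = @_;
    return 0 unless $genus and $genus !~ /^(Unknown|None)$/i;
    return 1;
}

# Assumption: the first word of the organism line is treated as the genus.
for my $organism (
    'unknown marine alpha proteobacterium JP66.1',
    'Homo sapiens',
    'None',
) {
    my ($genus) = split ' ', $organism;
    printf "%-45s %s\n", $organism,
        would_build_species($genus) ? 'species object' : 'undef';
}
```

The match is case-insensitive, so the lower-case "unknown" in the ORGANISM line of the test record is skipped just like "Unknown".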
There are 58 entries with Unknown as the first word in the OS line in the
current EMBL databank. It would not be too difficult to modify the parsers to
include these, but would it be useful, and how should it be done?
binomial() should return a valid scientific name, so we should not use
species, I guess. Higher taxa might be of some use. We already have one
exception written in, Viri, but these unknown species are even fuzzier.
If someone can come up with a plan, I am happy to code it in.
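Until then, calling code can check for the missing species object before dereferencing it. This is a minimal sketch, not a parser change: Local::StubSeq and Local::StubSpecies are hypothetical stand-ins so that it runs without BioPerl or a GenBank file, and species_name_or is an illustrative helper name. With BioPerl, $seq would instead come from Bio::SeqIO's next_seq as in the test command below.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the objects BioPerl would return.
{
    package Local::StubSpecies;
    sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub species { return $_[0]{name} }
}
{
    package Local::StubSeq;
    sub new        { my ($class, %args) = @_; return bless {%args}, $class }
    sub display_id { return $_[0]{id} }
    sub species    { return $_[0]{species} }    # may be undef, as in the report
}

package main;

# Return the species name, or a fallback when no species object exists.
sub species_name_or {
    my ($seq, $fallback) = @_;
    my $sp = $seq->species;
    return defined $sp ? $sp->species : $fallback;
}

# Simulates the reported case: the parser left the species undefined.
my $seq = Local::StubSeq->new(id => 'AY007677', species => undef);
print $seq->display_id, "\t", species_name_or($seq, 'n/a'), "\n";
```

The same defined-check works inline in a one-liner; the helper only keeps the fallback value in one place.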
-Heikki
On Thursday 11 Dec 2003 6:59 am, Magic Fang wrote:
> test file:
> LOCUS AY007677 1433 bp DNA linear BCT 29-OCT-2001
> DEFINITION Unknown marine alpha proteobacterium JP66.1 16S ribosomal RNA,
> partial sequence.
> ACCESSION AY007677
> VERSION AY007677.1 GI:12000363
> KEYWORDS .
> SOURCE unknown marine alpha proteobacterium JP66.1
> ORGANISM unknown marine alpha proteobacterium JP66.1
> Bacteria; Proteobacteria; Alphaproteobacteria.
> REFERENCE 1 (bases 1 to 1433)
> AUTHORS Eilers,H., Pernthaler,J., Peplies,J., Glockner,F.O., Gerdts,G.
> and Amann,R.
> TITLE Isolation of novel pelagic bacteria from the German bight and
> their seasonal contributions to surface picoplankton
> JOURNAL Appl. Environ. Microbiol. 67 (11), 5134-5142 (2001)
> MEDLINE 21536174
> PUBMED 11679337
> REFERENCE 2 (bases 1 to 1433)
> AUTHORS Eilers,H., Pernthaler,J., Peplies,J., Gloeckner,F.O.,
> Gerdts,G., Schuett,C. and Amann,R.
> TITLE Identification and seasonal dominance of culturable marine
> bacteria
> JOURNAL Unpublished
> REFERENCE 3 (bases 1 to 1433)
> AUTHORS Eilers,H., Pernthaler,J., Peplies,J., Gloeckner,F.O.,
> Gerdts,G., Schuett,C. and Amann,R.
> TITLE Direct Submission
> JOURNAL Submitted (30-AUG-2000) Molecular Ecology,
> Max-Planck-Institute, Celsiusstrasse 1, Bremen 28359, Germany
> FEATURES Location/Qualifiers
> source 1..1433
> /organism="unknown marine alpha proteobacterium JP66.1"
> /mol_type="genomic DNA"
> /db_xref="taxon:145652"
> rRNA <1..>1433
> /product="16S ribosomal RNA"
> ORIGIN
> 1 tcatggctca gaacgaacgc tggcggcagg cttaacacat gcaagtcgaa cgatctcttc
> 61 ggagatagtg gcagacgggt gagtaacgcg tgggaaccta ccttattcta cggaataaca
> 121 gttagaaatg actgctaata ccgtatacgc ccttcggggg aaagatttat cggagtagga
> 181 tgggcccgcg ttggattagc tagttggtgg ggtaatggcc taccaaggcg acgatctata
> 241 gctggtctga gaggatgatc agccacactg gaactgagac acggtccaga ctcctacggg
> 301 aggcagcagt ggggaatatt ggacaatggg cgcaagcctg atccagccat gccgcctgag
> 361 tgatgaaggc cttagggttg taaagctctt tcaacggtga agataatgac ggtaaccgta
> 421 gaagaagccc cggctaactt cgtgccagca gccgcggtaa tacgaagggg gctagcgttg
> 481 ttcggaatta ctgggcgtaa agcgtacgta ggcggattag aaagttaggg gtgaaatccc
> 541 agggctcaac cctggaactg cctctaaaac tcctaatctt gagttcgaga gaggtgagtg
> 601 gaattccgag tgtagaggtg aaattcgtag atattcggag gaacaccagt ggcgaaggcg
> 661 gctcactggc tcgatactga cgctgaggta cgaaagcgtg gggagcaaac aggattagat
> 721 accctggtag tccacgccgt aaacgatgaa tgttagccgt cgggcagtat actgttcggt
> 781 ggcgcagcta acgcattaaa cattccgcct ggggagtacg gtcgcaagat taaaactcaa
> 841 aggaattgac gggggcccgc acaagcggtg gagcatgtgg tttaattcga agcaacgcgc
> 901 agaaccttac cagcccttga cataccaatc gcggttagtg gagacacttt ccttcagttc
> 961 ggctggattg gatacaggtg ctgcatggct gtcgtcagct cgtgtcgtga gatgttgggt
> 1021 taagtcccgc aacgagcgca accctcgcct ttagttgcca gcatttagtt gggcactcta
> 1081 gagggactgc cggtgataag ccggaggaag gtggggatga cgtcaagtcc tcatggccct
> 1141 tacgggctgg gctacacacg tgctacaatg gtggtgacag tgggcagcga gacggcaacg
> 1201 tcgagctaat ctccaaaaac catctcartt cggattgggg tctgcaactc gacccccatg
> 1261 aagttggaat cgctagtaat cgcggatcag catgccgcgg tgaatacgtt cccgggcctt
> 1321 gtacacaccg cccgtcacac catgggagtt ggtcttaccc gaaggcgatg cgctaaccag
> 1381 caatggaggc agtcgaccac ggtagggtca gcgactgggg tgaagtcgta aca
> //
>
> test command:
> $ perl -e 'use Bio::SeqIO;$in=Bio::SeqIO->new(-file=>"test.gbk",
> -format=>"genbank");$seq=$in->next_seq;print $seq->display_id, "\t",
> $seq->species->species, "\n";'
>
> error message:
> Can't call method "species" on an undefined value at -e line 1, <GEN0> line
> 61.
>
> bioperl version 1.3.01
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
--
______ _/ _/_____________________________________________________
_/ _/ http://www.ebi.ac.uk/mutations/
_/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk
_/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
_/ _/ _/ Wellcome Trust Genome Campus, Hinxton
_/ _/ _/ Cambs. CB10 1SD, United Kingdom
_/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________