[Bioperl-l] Bio::SeqIO::genbank, Bio::Species - can't get full species name
James Wasmuth
james.wasmuth at ed.ac.uk
Thu May 13 09:51:57 EDT 2004
Hi Matthew,
I fixed this; it's in CVS for SeqIO/genbank.pm.
I would give you the link but the docs have been playing up all day...
Let me know if it doesn't do what you want...
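
If you need something before you can update from CVS, one workaround is to
bypass Bio::Species and read the name straight from the flat file. Below is a
minimal sketch in plain Perl (no BioPerl calls); it is not the change that
went into genbank.pm. The helper name organism_and_taxon is made up for
illustration, and it assumes the organism name fits on a single ORGANISM line
and that the source feature carries a /db_xref="taxon:..." qualifier, as in
your AY211864 record (trimmed here). Since names.dmp is keyed by tax_id, the
taxon ID may be the safer key for matching against the NCBI taxonomy.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): given the text of one GenBank
# flat-file record, return the full ORGANISM name and the NCBI taxon ID.
# Assumption: the organism name is not wrapped onto a second line.
sub organism_and_taxon {
    my ($record) = @_;

    # Everything after the ORGANISM keyword, up to the end of that line.
    my ($organism) = $record =~ /^\s*ORGANISM\s+(.+?)\s*$/m;

    # The source feature stores the taxonomy ID as /db_xref="taxon:NNN".
    my ($taxon_id) = $record =~ m{/db_xref="taxon:(\d+)"};

    return ($organism, $taxon_id);
}

# Trimmed copy of the AY211864 record from the original message.
my $record = <<'END';
SOURCE      mitochondrion Tamias amoenus X Tamias ruficaudus
  ORGANISM  Tamias amoenus X Tamias ruficaudus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
FEATURES             Location/Qualifiers
     source          1..701
                     /organism="Tamias amoenus X Tamias ruficaudus"
                     /db_xref="taxon:231237"
END

my ($organism, $taxon_id) = organism_and_taxon($record);
print "$organism\t$taxon_id\n";
```

For a file with many entries you would split the input on the "//" record
terminator and call the helper once per record.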
-james
Matthew Betts wrote:
>Hi,
>
>I am trying to reconcile gene trees with species trees, and to do this I
>need the species names to be the same in both cases. The gene trees come
>from a clustering of GenBank coding sequences, and the species trees come
>from the NCBI taxonomy. However, when using BioPerl to extract the species
>info from GenBank entries, it only seems possible to get the first
>three words from the ORGANISM line, which are treated as genus, species,
>and subspecies in Bio::Species. However, in several cases, such as the
>example below, there is more information in the ORGANISM line. I suspect
>that this means either that the subspecies name uses more than one word or
>that the GenBank format is being broken. However, this is also how the names
>appear in the NCBI taxonomy names.dmp file.
>
>The problem seems to be in Bio::SeqIO::genbank->_read_GenBank_Species().
>There is a special condition there for viruses (the whole of the ORGANISM
>info is put onto the classification array), but the examples I have are
>for chordates (there may be others).
>
>I'd be really grateful for any comments on the best thing for me to do.
>
>Thanks,
>
>Matthew
>
>
>
>LOCUS       AY211864                 701 bp    DNA     linear   ROD 25-AUG-2003
>DEFINITION  Tamias amoenus X Tamias ruficaudus RBCM19680 cytochrome b (cytb)
>            gene, partial cds; mitochondrial gene for mitochondrial product.
>ACCESSION   AY211864
>VERSION     AY211864.1  GI:33385214
>KEYWORDS    .
>SOURCE      mitochondrion Tamias amoenus X Tamias ruficaudus
>  ORGANISM  Tamias amoenus X Tamias ruficaudus
>            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
>            Mammalia; Eutheria; Rodentia; Sciurognathi; Sciuridae; Sciurinae;
>            Tamias.
>REFERENCE   1  (bases 1 to 701)
>  AUTHORS   Good,J.M., Demboski,J.R., Nagorsen,D.W. and Sullivan,J.
>  TITLE     Phylogeography and introgressive hybridization: chipmunks (genus
>            Tamias) in the northern Rocky Mountains
>  JOURNAL   Evolution 57 (8), 1900-1916 (2003)
>REFERENCE   2  (bases 1 to 701)
>  AUTHORS   Good,J.M., Demboski,J.R., Nagorsen,D.W. and Sullivan,J.
>  TITLE     Direct Submission
>  JOURNAL   Submitted (08-JAN-2003) Ecology and Evolutionary Biology,
>            University of Arizona, 1041 E. Lowell Street, Tucson, AZ 85721, USA
>FEATURES             Location/Qualifiers
>     source          1..701
>                     /organism="Tamias amoenus X Tamias ruficaudus"
>                     /organelle="mitochondrion"
>                     /mol_type="genomic DNA"
>                     /specimen_voucher="Royal British Columbia Museum
>                     (RBCM19680)"
>                     /db_xref="taxon:231237"
>     gene            1..>701
>                     /gene="cytb"
>     CDS             1..>701
>                     /gene="cytb"
>                     /codon_start=1
>                     /transl_table=2
>                     /product="cytochrome b"
>                     /protein_id="AAP45298.1"
>                     /db_xref="GI:33385215"
>                     /translation="MTNIRKTHPLIKIINHSFIDLPAPSNISAWWNFGSLLGICLIIQ
>                     ILTGLFLAMHYTSDTMTAFSSVTHICRDVNYGWLIRYMHANGASMFFICLFLHVGRGL
>                     YYGSYTYFETWNIGVILLFAVMATAFMGYVLPWGQMSFWGATVITNLLSAIPYIGTTL
>                     VEWIWGGFSVDKATLTRFFAFHFILPFIITALVMVHLLFLHETGSNNPSGLISDSDKI
>                     PFHPYYTIKDILGILL"
>ORIGIN
>        1 atgacaaaca tccgcaaaac ccatcccctc attaaaatca ttaaccactc attcattgac
>       61 ttacccgcac catccaacat ttctgcatga tgaaattttg gatccctctt aggtatttgc
>      121 ctaattatcc aaattctcac tggactattc ctagcaatac actacacatc cgacacaatg
>      181 acagctttct catctgtcac tcatatttgc cgagatgtaa actacggctg acttatccga
>      241 tacatacacg ctaacggagc ctccatattt tttatctgcc tattccttca tgtaggccga
>      301 ggactttact atggatcata tacctacttc gaaacatgaa acattggagt aattctttta
>      361 ttcgccgtta tagccactgc atttataggt tacgttctcc catgaggaca gatatccttt
>      421 tgaggtgcta ctgttattac aaatctccta tcagccatcc catatatcgg aacaacacta
>      481 gtagaatgaa tctgaggagg cttctcagta gacaaagcca ctctaacacg attctttgca
>      541 tttcatttta tcctcccatt cattattaca gcattagtta tagttcacct actcttcctt
>      601 catgaaaccg gatccaataa tccttccgga ttaatctctg actctgataa aattccattc
>      661 catccatatt acactattaa agatatccta ggcatcctcc t
>//
>
>_______________________________________________
>Bioperl-l mailing list
>Bioperl-l at portal.open-bio.org
>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
>
--
"There are some days when I think I'm going to die from
an overdose of satisfaction."
--- Salvador Dali
Nematode Bioinformatics |
Blaxter Nematode Genomics Group |
School of Biological Sciences |
Ashworth Laboratories | tel: +44 131 650 7403
University of Edinburgh | web: www.nematodes.org
Edinburgh |
EH9 3JT |
UK |