[Bioperl-l] Re: Bio::Index:EMBL on embl flatfiles

Jonathan Miller millerj at bcm.tmc.edu
Mon Apr 4 13:44:51 EDT 2005


More specifically, the goal is to BLAST a
(local) fasta file; find the sequence location,
and look up its annotation in a (local) EMBL flatfile.

So, for example, for the honeybee fasta file from EMBL,
Apis_mellifera.AMEL1.1.mar.dna.contig.fa,
the fasta header of the
contig where the sequence is found might be:

>Contig18.1.1312 dna:contig scaffold:AMEL1.1:Group1.1:1:1312:1

Now I want to look up the annotation in the EMBL
flat files (for example, Apis_mellifera.0.dat) that
I have indexed using Bio::Index::EMBL.

However, the accession numbers in the EMBL flat files
have the form:  scaffold:AMEL1.1:Group1.10:1:348491:1

and apparently "scaffold:AMEL1.1:Group1.1:1:1312:1" never
appears in an EMBL flat file.
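
To make the mismatch concrete, the two identifiers can be split on
colons and compared field by field. What follows is only a minimal
Python sketch, not part of BioPerl. It assumes the six colon-separated
fields mean coordinate system, assembly, region name, start, end and
strand, and that the location is the third whitespace-separated token
of the fasta header. The names Location, parse_location,
parse_fasta_header and same_region are hypothetical helpers introduced
for illustration.

```python
from typing import NamedTuple, Tuple


class Location(NamedTuple):
    """One colon-separated location string (field meanings are assumed)."""
    coord_system: str
    assembly: str
    name: str
    start: int
    end: int
    strand: int


def parse_location(text: str) -> Location:
    # Assumed layout: coord_system:assembly:name:start:end:strand
    coord_system, assembly, name, start, end, strand = text.split(":")
    return Location(coord_system, assembly, name,
                    int(start), int(end), int(strand))


def parse_fasta_header(header: str) -> Tuple[str, Location]:
    # Assumed layout: ">contig_id dna:contig location"
    fields = header.lstrip(">").split()
    return fields[0], parse_location(fields[2])


def same_region(a: Location, b: Location) -> bool:
    # Compare only the region identity, ignoring start, end and strand.
    return (a.coord_system, a.assembly, a.name) == \
           (b.coord_system, b.assembly, b.name)


header = ">Contig18.1.1312 dna:contig scaffold:AMEL1.1:Group1.1:1:1312:1"
contig_id, fasta_loc = parse_fasta_header(header)
embl_loc = parse_location("scaffold:AMEL1.1:Group1.10:1:348491:1")
match = same_region(fasta_loc, embl_loc)
```

With the two example strings from this message, the region names come
out as Group1.1 and Group1.10, so the identifiers differ in the name
field as well as in the end coordinate, and an exact-string lookup
cannot succeed.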

I don't know if I should be indexing differently,
searching on a different key, or what.

With NCBI fasta and GenBank flat files, this procedure
was straightforward (i.e. no thought was required)
to implement successfully. Presumably there is an
analogous interface for the EMBL format?




