[Bioperl-l] Indexing CDS file

Wed Feb 11 13:24:30 UTC 2009

I'm guessing the PA line would be similar to the DBSOURCE line in GenPept
files. We could probably use Bio::Annotation::DBLink or
Bio::Annotation::Target for it (if it corresponds to a particular subset
of the sequence).

chris

On Feb 11, 2009, at 6:44 AM, Heikki Lehvaslaiho wrote:

> Dave,
>
> Looks good. Are you going to make the changes to the EMBL parser?
>
>   -Heikki
>
> 2009/2/11 Dave Messina <David.Messina at sbc.su.se>:
>> Thanks, Heikki.
>>
>> I took a closer look at the EBI ftp site where Sviya and I got the file, and
>> in their README (ftp://ftp.ebi.ac.uk/pub/databases/embl/cds/README.txt) it
>> says:
>>
>> PA line - contains the accession.version of the "parent" EMBL entry
>>          (entry where the CDS is annotated)
>>
>>
>> So, unfortunately they've decided that a CDS record, which has no accession
>> of its own, doesn't get its parent's accession number, but gets to refer to
>> its parent's accession number via the PA line.
>>
>> Furthermore, there's an
>>
>> OX line - contains the NCBI taxid for the organism; taxonomic data are taken
>>          from the parent EMBL entries
>>
>> which is also not part of the formal spec (although this one is a more
>> worthwhile addition, IMO).
>>
>> Sooooo, I think we'll need to add support for these.
>>
>> 'PA' seems easy enough -- the EMBL parser can look for it if there isn't an
>> 'AC' line.
>>
>> As for 'OX', is there a standard slot for a taxonID in a RichSeq SeqFeature
>> table? Coming from a GenBank record or a vanilla EMBL record, this is
>> normally encoded as
>>
>> primary tag: source
>> tag: db_xref
>> value: taxon:9606
>>
>> right?
>>
>> Should we do the same if we're coming from an EMBL CDS entry, even though
>> the taxid isn't actually in the feature table?
>>
>>
>> Dave
>>
>>
>
>
>
> -- 
>    -Heikki
> Heikki Lehvaslaiho - heikki lehvaslaiho gmail com
> Sent from: Johannesburg Gauteng South Africa.
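
To make the two proposals in this thread concrete, here is a minimal sketch of the logic in plain Python. It is not the Bio::SeqIO::embl code and does not call BioPerl. The function name `parse_cds_header` is hypothetical, the accession values are made up, and the exact layout of the PA and OX lines is an assumption: the README only says that PA holds the parent's accession.version and OX holds the NCBI taxid. The sketch falls back to PA when there is no AC line, and turns OX into the same `db_xref` value (`taxon:NNNN`) that a source feature would normally carry.

```python
import re


def parse_cds_header(lines):
    """Collect accession and taxon info from EMBL-style two-letter line codes.

    Hypothetical helper for illustration only; the line layouts are assumed.
    """
    record = {"accession": None, "parent_accession": None, "db_xref": []}
    for line in lines:
        # The two-letter line code is followed by spaces and then the value.
        code, _, rest = line.partition(" ")
        value = rest.strip()
        if code == "AC":
            # Keep the first accession; treat an empty AC line as missing.
            record["accession"] = value.split(";")[0].strip() or None
        elif code == "PA":
            # Parent accession.version, kept separately so it is not lost.
            record["parent_accession"] = value.rstrip(";").strip() or None
        elif code == "OX":
            # Assumed to contain the taxid as its first run of digits.
            match = re.search(r"\d+", value)
            if match:
                record["db_xref"].append("taxon:" + match.group(0))
    # Fall back to the parent accession only when no AC line was seen.
    if record["accession"] is None:
        record["accession"] = record["parent_accession"]
    return record


example = parse_cds_header([
    "ID   CDS_EXAMPLE",
    "PA   AB000123.1",
    "OX   NCBI_TaxID=9606;",
])
print(example["accession"], example["db_xref"])
```

Keeping `parent_accession` as its own field leaves room for either approach discussed above: using it as the record's accession, or attaching it as a database link instead.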