[Bioperl-l] Protein Records without Sequence
    Warren Gallin 
    wgallin at ualberta.ca
       
    Wed Jun  5 18:16:57 UTC 2013
    
    
  
Hi,
I am encountering a problem with a number of protein records.
An HMMER search of the nr database returns a gi number and an associated sequence.
When I use that gi number to try to retrieve the full GENBANK record, however, there is no sequence returned with the record.
When I look up that gi number in the NCBI web interface, the GENPEPT record is returned with no sequence, but when I select FASTA format the sequence is returned.
Examples of gi numbers for which this occurs are:
23099847
21224301
68536697
46580017
77359109
Is this a flaw in the individual GENPEPT records?  If so, should I report it to NCBI?
Or are these some kind of "special" record that needs different parameters passed to the eutils search?
There is a workaround, I guess: if the sequence comes back empty, a second retrieval of the FASTA-formatted record can be run and the empty sequence field in the GENPEPT record repopulated from it, but this seems inelegant.
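A minimal sketch of that fallback, written in Python rather than BioPerl so that it is self-contained. It assumes the NCBI E-utilities efetch endpoint with db=protein and the rettype values "gp" and "fasta". The function names (efetch_text, genpept_sequence, fasta_sequence, sequence_with_fallback) are hypothetical, chosen only for illustration, and the flat-file parsing is deliberately simplistic.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the public NCBI E-utilities efetch service.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_text(gi, rettype):
    """Fetch one protein record as plain text ('gp' or 'fasta')."""
    query = urlencode(
        {"db": "protein", "id": gi, "rettype": rettype, "retmode": "text"}
    )
    with urlopen(f"{EFETCH_URL}?{query}") as response:
        return response.read().decode("utf-8")


def genpept_sequence(record):
    """Return residues from the ORIGIN block of a GenPept record, or ''."""
    residues = []
    in_origin = False
    for line in record.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            in_origin = False
        elif in_origin:
            # Sequence lines look like "        1 mkvl laa"; keep letters only.
            residues.append("".join(ch for ch in line if ch.isalpha()))
    return "".join(residues)


def fasta_sequence(record):
    """Return residues of the first entry in a FASTA-formatted record."""
    residues = []
    seen_header = False
    for line in record.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seen_header:
                break  # start of a second entry; stop here
            seen_header = True
        elif seen_header:
            residues.append(line)
    return "".join(residues)


def sequence_with_fallback(gi, fetch=efetch_text):
    """Return (genpept_record, sequence), refetching as FASTA if needed."""
    record = fetch(gi, "gp")
    sequence = genpept_sequence(record)
    if not sequence:
        # GenPept record came back without residues: retry as FASTA.
        sequence = fasta_sequence(fetch(gi, "fasta"))
    return record, sequence.upper()
```

Called as sequence_with_fallback("23099847"), this would make at most two requests per gi number. The fetch argument can be swapped for a local stub when testing without network access.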
All advice and/or commentary appreciated.
Warren Gallin
    
    