[Bioperl-l] Polyproteins, ribo slippage, and mat_peptide in viruses?

bill at genenformics.com bill at genenformics.com
Tue Oct 27 21:47:02 UTC 2009


These mature proteins do have GI numbers. They are listed in the
gene2accession file on the NCBI FTP site:
ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2accession.gz

>grep NC_001959 gene2accession
11983   1491970 REVIEWED        -       -       NP_786945.1     28416959        NC_001959.2     106060735       4       5373    +       -
11983   1491970 REVIEWED        -       -       NP_786946.1     28416960        NC_001959.2     106060735       4       5373    +       -
11983   1491970 REVIEWED        -       -       NP_786947.1     28416961        NC_001959.2     106060735       4       5373    +       -
11983   1491970 REVIEWED        -       -       NP_786948.1     28416962        NC_001959.2     106060735       4       5373    +       -
11983   1491970 REVIEWED        -       -       NP_786949.1     28416963        NC_001959.2     106060735       4       5373    +       -
11983   1491970 REVIEWED        -       -       NP_786950.1     28416964        NC_001959.2     106060735       4       5373    +       -
11983   1491971 PROVISIONAL     -       -       NP_056822.1     9630806         NC_001959.2     106060735       6949    7587    +       -
11983   1491972 PROVISIONAL     -       -       NP_056821.2     106060736       NC_001959.2     106060735       5357    6949    +       -
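
For anyone who would rather script this than grep, here is a minimal
Python sketch. It assumes the 13-column, tab-delimited layout seen in the
output above (GeneID in column 2, protein accession in column 6, protein
GI in column 7, genomic accession in column 8); check the header line of
your copy of the file before relying on those positions. The function
name proteins_by_gene is only illustrative.

```python
from collections import defaultdict


def proteins_by_gene(lines, genomic_accession):
    """Group (protein accession, protein GI) pairs by GeneID.

    `lines` is any iterable of tab-delimited gene2accession rows.
    `genomic_accession` is matched without its version suffix,
    e.g. "NC_001959" matches "NC_001959.2".
    Column positions are an assumption based on the grep output above.
    """
    result = defaultdict(list)
    for line in lines:
        if line.startswith("#"):
            continue  # skip the header/comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a data row we can interpret
        gene_id, protein_acc, protein_gi, genomic_acc = (
            fields[1], fields[5], fields[6], fields[7])
        if genomic_acc.split(".")[0] != genomic_accession:
            continue
        if protein_acc == "-":
            continue  # row has no protein product
        result[gene_id].append((protein_acc, protein_gi))
    return dict(result)
```

To run it on the compressed download directly, something like
gzip.open("gene2accession.gz", "rt") can be passed as `lines`. For the
rows above, GeneID 1491970 should come back with the six NP_7869xx
accessions and their GIs.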

Bill
> On Tue, Oct 27, 2009 at 8:46 PM, Chris Fields <cjfields at illinois.edu>
> wrote:
>>>
>>> Ah. That's a shame. I did just take a few minutes to try out the
>>> EFetch idea (using Biopython) and it does work beautifully for
>>> this "nice" example virus which the NCBI have annotated.
>>
>> Interesting thing about that example: if you follow the hyperlinks for the
>> mat_peptide feature key, they relate back to the full protein sequence with
>> from/to, not to the protein_id for the feature.  Example:
>>
>> # link from the first mat_peptide
>> http://www.ncbi.nlm.nih.gov/protein/9630804?from=1&to=398&report=gpwithparts
>>
>> # protein_id
>> http://www.ncbi.nlm.nih.gov/protein/28416959
>
> Right - the protein ID link is just based on the GI number, 28416959.
> This link (or EFetch) gives you the (short) mature peptide.
>
>> This record doesn't appear to contain any mapping information along those
>> lines, which makes me think this is an autogenerated record using the Gene
>> record, which does have those mappings:
>>
>> http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene&cmd=Retrieve&dopt=full_report&list_uids=1491970
>
> Are you suggesting one option is (if the mat_peptide annotation
> is lacking a protein or GI number) to go online to the Gene
> database using the gene ID of the precursor (parent) protein
> to find the IDs of the mature (child) peptides?
>
> Peter
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>




