[Bioperl-l] Get variation included in genbank file
Dave Messina
David.Messina at sbc.su.se
Wed Jun 9 17:20:12 UTC 2010
Hi again Jessica,
Forgive my slowness here, but is this what you want to do?
1) Start with an NM_ mRNA record
in your example, NM_001110556.1
2) Obtain the corresponding NG_ genomic locus record in GenBank format
which would correspond to the example file you attached. Accession number NG_011506
Is that right?
There are probably more clever ways to do this, but here's how I would approach it:
1) extract the GeneID dbxref from the NM_ mRNA record using Bio::SeqIO.
See http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Getting_the_Features
for details.
2) Use that to query the Gene database and get the related NG_ record
I don't know exactly which field or link name leads to the NG_ record, but you can list them all using this example:
http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#What_information_is_available_for_database_.27x.27.3F
and figure it out via trial and error.
3) Once you have the NG_ id, you can retrieve the GenBank record.
Here's the relevant example:
http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#Retrieve_raw_data_records_from_GenBank.2C_save_raw_data_to_file.2C_then_parse_via_Bio::SeqIO
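To make the three steps above a bit more concrete, here is a minimal sketch of how they might fit together. Treat it as a starting point rather than a finished solution. It assumes the NM_ record has already been saved locally in GenBank format. The e-mail address, the default output file name, and the helper gene_id_from_xrefs are made up for illustration. The link name 'gene_nuccore_refseqgene' is my assumption for the Gene-to-NG_ link and is exactly the part that may need the trial and error mentioned in step 2.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Bio::SeqIO;
use Bio::DB::EUtilities;

# Hypothetical values, not taken from the original thread.
my $EMAIL    = 'you@example.org';           # contact address sent to NCBI
my $LINKNAME = 'gene_nuccore_refseqgene';   # ASSUMED Gene -> NG_ link name

# Return the first GeneID found in a list of db_xref strings, or undef.
sub gene_id_from_xrefs {
    for my $xref (@_) {
        return $1 if $xref =~ /^GeneID:(\d+)$/;
    }
    return;
}

sub main {
    my ($mrna_file, $out_file) = @_;
    $out_file = 'refseqgene.gb' unless defined $out_file;

    # Step 1: collect the db_xref tags from the features of the NM_ record.
    my $seq = Bio::SeqIO->new(-file => $mrna_file, -format => 'genbank')->next_seq
        or die "No sequence found in $mrna_file\n";
    my @xrefs = map  { $_->get_tag_values('db_xref') }
                grep { $_->has_tag('db_xref') } $seq->get_SeqFeatures;
    my $gene_id = gene_id_from_xrefs(@xrefs);
    die "No GeneID db_xref in $mrna_file\n" unless defined $gene_id;

    # Step 2: follow the link from the Gene record to the nucleotide database.
    my $elink = Bio::DB::EUtilities->new(
        -eutil    => 'elink',
        -email    => $EMAIL,
        -dbfrom   => 'gene',
        -db       => 'nuccore',
        -linkname => $LINKNAME,
        -id       => [$gene_id],
    );
    my @nuc_ids;
    while (my $linkset = $elink->next_LinkSet) {
        push @nuc_ids, $linkset->get_ids;
    }
    die "No linked nucleotide records for GeneID $gene_id\n" unless @nuc_ids;

    # Step 3: save the raw GenBank record(s) to a file, then parse them.
    my $efetch = Bio::DB::EUtilities->new(
        -eutil   => 'efetch',
        -email   => $EMAIL,
        -db      => 'nuccore',
        -rettype => 'gb',
        -id      => \@nuc_ids,
    );
    $efetch->get_Response(-file => $out_file);

    my $in = Bio::SeqIO->new(-file => $out_file, -format => 'genbank');
    while (my $record = $in->next_seq) {
        my @features = $record->get_SeqFeatures;
        printf "%s: %d features\n", $record->accession_number, scalar @features;
    }
}

if (@ARGV) { main(@ARGV) }
else       { warn "usage: $0 NM_record.gb [output.gb]\n" }
```

You would run it with the saved NM_ file as the first argument and, optionally, an output file name as the second. If the assumed link name returns nothing, the einfo example from the cookbook link in step 2 lists the links available from the Gene database, and one of those can be substituted.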
So, by now it should be obvious that I'm presenting a general strategy. You'll have to do some legwork to get exactly what you want.
Good luck, and if you come up with a nice solution, please add it to the wiki!
Dave
> I need to automatically get a gbk file like this one, with Variation (dbSNP) features included and correct mRNA/CDS regions. Can it be done automatically using EUtilities? I am not sure.
>
> thx
>
>
> On Mon, Jun 7, 2010 at 5:18 PM, Dave Messina <David.Messina at sbc.su.se> wrote:
> Hi Jessica,
>
>
> > Does anyone know how to include variation (dbSNP) in the GenBank file format
> > automatically from an NM_ accession number using BioPerl?
>
> I'm not sure I understand the question.
>
> As far as I know, Genbank records don't include SNP information. See for example the record for human p53 (which has SNPs):
>
> http://www.ncbi.nlm.nih.gov/nuccore/NM_000546.4
>
>
> I think though you should be able to get to a dbSNP record if you have a NM_ accession number using the BioPerl interface to NCBI's EUtilities.
>
> More information here:
>
> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
>
>
> If that's not what you're after, could you clarify what you want to do?
>
>
> Dave
>
>
>
>
> --
> Jessica Jingping Sun
> <FLNA.gbk>