[Bioperl-l] Get variation included in genbank file
Chris Fields
cjfields at illinois.edu
Thu Jun 10 00:06:42 UTC 2010
It's much easier to work with the GI than the accession. NCBI unfortunately just recently 'broke' their acc->gi stuff via efetch; you have to use rettype='seqid' and munge ASN.1 to get everything (though it is nice in a way for ID mapping).
After the initial step of grabbing the GI for NG_011506, though, you can use elink to grab the SNP IDs, then use efetch to get the actual SNP files, or esummary for the summary info.
#!/usr/bin/perl -w
use Modern::Perl;
use Bio::DB::EUtilities;

# GI obtained in the initial step
my $id = '224809339';

# elink from nuccore to snp; 'neighbor_history' keeps the linked IDs
# on NCBI's history server so later calls can reuse them
my $eutil = Bio::DB::EUtilities->new(-eutil   => 'elink',
                                     -id      => $id,
                                     -email   => 'setyourown@foo.bar',
                                     -verbose => 1,
                                     -dbfrom  => 'nuccore',
                                     -db      => 'snp',
                                     -cmd     => 'neighbor_history',
                                     );

my $hist = $eutil->next_History || die "No history data returned";

# fetch the actual SNP records using the stored history
$eutil->set_parameters(-eutil   => 'efetch',
                       -history => $hist,
                       -retmode => 'text',
                       # 'chr', 'flt', 'brief', 'rsr', 'docset'
                       -rettype => 'chr'
                       );

$eutil->get_Response(-file => 'snps.txt');

# or ...

# get summary info for the same set of SNPs
$eutil->set_parameters(-eutil   => 'esummary',
                       -history => $hist,
                       );

$eutil->print_all;
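
For the initial step (going from the accession to a GI), something along these lines might work. This is only a rough sketch: it assumes the rettype='seqid' response contains a line of the form "Seq-id ::= gi <number>", and the sample text below is hand-written for illustration, not real NCBI output. The sub names gi_from_seqid_text and fetch_gi are made up for this example.

```perl
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;

# Pull the GI out of ASN.1-style seqid text.
# Assumption: the GI appears on its own line as "Seq-id ::= gi <digits>".
sub gi_from_seqid_text {
    my ($text) = @_;
    return $text =~ /^Seq-id\s*::=\s*gi\s+(\d+)/m ? $1 : undef;
}

# Hypothetical wrapper: fetch the seqid text for an accession via efetch,
# then parse it. Needs network access and Bio::DB::EUtilities installed,
# so the module is only loaded when this sub is actually called.
sub fetch_gi {
    my ($acc) = @_;
    require Bio::DB::EUtilities;
    my $factory = Bio::DB::EUtilities->new(-eutil   => 'efetch',
                                           -db      => 'nuccore',
                                           -id      => $acc,
                                           -email   => 'setyourown@foo.bar',
                                           -rettype => 'seqid',
                                           );
    return gi_from_seqid_text($factory->get_Response->content);
}

# Hand-written sample (illustrative only), so this runs without the network
my $sample = <<'END';
Seq-id ::= other {
  accession "NG_011506"
}
Seq-id ::= gi 224809339
END

print "GI: ", gi_from_seqid_text($sample) // 'not found', "\n";
```

With a live connection, fetch_gi('NG_011506') would be the call to try; if the response format differs from what is assumed here, the regex needs adjusting.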
# chris
On Jun 9, 2010, at 1:37 PM, Jessica Sun wrote:
> Thanks Dave.
> (the variation information is not present in the version of NG_011506 I found
> at Genbank.) -- Yes, but if you click "Customize view" on the right side, there
> is a check box "Features added by NCBI: 209 SNPs"; if you check that, it will
> add all the variations in the gbk format. It would be a neat feature if
> BioPerl could load these automatically when an option is turned on.
>
>
>
> On Wed, Jun 9, 2010 at 1:51 PM, Dave Messina <David.Messina at sbc.su.se> wrote:
>
>> Hi Jessica,
>>
>> Please keep the BioPerl list on the Cc line so everyone can follow along.
>>
>>
>>> Following your approach, it did not seem to me that you can have the
>>> Variation tag included, which lists the known dbSNP location, id and
>>> allele changes?
>>
>> Ah okay, I assumed the file you attached was obtained directly from Genbank
>> and that the variation info therein was already included. (It appears that's
>> not the case — the variation information is not present in the version of
>> NG_011506 I found at Genbank.)
>>
>> If you want to include your own custom information in a genbank file,
>> you'll have to pull it out of dbSNP (or wherever the variation info is).
>> There are a couple of scripts that might be able to help with that (search
>> for snp):
>>
>> http://www.bioperl.org/wiki/Bioperl_scripts
>>
>>
>> You can then insert them into a RichSeq object as features and output in
>> genbank format. For that part, see the HOWTO:
>>
>> http://www.bioperl.org/wiki/HOWTO:Feature-Annotation
>>
>>
>> Dave
>>
>>
>
>
> --
> Jessica Jingping Sun
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l