[Bioperl-l] automation of translation based on alignment
Ross KK Leung
ross at cuhk.edu.hk
Mon Mar 22 13:17:05 UTC 2010
Chris,
The following codes are what I use to retrieve sequences from GenBank. I
know that I can use something like:
for my $feature ($seqobj->get_SeqFeatures){
if ($feature->primary_tag eq "CDS") {
...
To get features, but how should
Bio::Location::SplitLocation
be used? Do you mean something like:
If ($seq->is_circular()) {
spliced_seq();
}
? But the genome indeed has several such spliced sequences then how can I
specify which is to retrieve? Thanks for your advice again~
#!/usr/bin/perl
use Bio::SeqIO::genbank; use Bio::DB::GenBank;
use Bio::DB::RefSeq;
$gb = new Bio::DB::GenBank;
my ($acc, $start, $stop) = @ARGV;
my $gb = Bio::DB::GenBank->new(-format => 'Fasta',
-seq_start => "$start",
-seq_stop => "$stop",
-strand => 1);
$gbout = $acc;
$seq = $gb->get_Seq_by_acc($acc);
print "seq is ", $seq->seq, "\n";
$seqio_obj = Bio::SeqIO->new(-file => ">$gbout.fa", -format => 'fasta' );
$seqio_obj->write_seq($seq);
exit;
-----Original Message-----
From: Chris Fields [mailto:cjfields at illinois.edu]
Sent: Monday, March 22, 2010 8:58 PM
To: Ross KK Leung
Cc: 'Florent Angly'; 'bioperl-l List'
Subject: Re: [Bioperl-l] automation of translation based on alignment
On Mar 22, 2010, at 12:30 AM, Ross KK Leung wrote:
> Dear Chris,
>
> It seems that Bioperl is "clever" enough to "rectify" my start and stop by
> reversing the order.
>
> e.g.
> start = 2300
> stop = 1600
>
> It will reverse back to 1600 and then 2300.
> What else to tell that I'm now working on a circular genome?
Reverse it where, the alignment or the feature? The svn version of BioPerl,
for alignments, retains strand information (this was a bug that was fixed).
For features, start is always less than end, with directionality determined
by strand. For a circular genome, the feature is split across the origin,
as you have seen in the original sequence you posted:
...
gene join(2307..3215,1..1623)
/gene="P"
...
This would be represented as a Bio::Location::SplitLocation in the feature;
it would joined based on that order if $seq->is_circular() is true (or at
least it should). In cases like this, the safe bet is to call spliced_seq()
to get the joined sequence in all cases, then call translate() to get the
protein sequence.
chris
More information about the Bioperl-l
mailing list