[Bioperl-l] genomic coordinates always on the plus strand
Hermann Norpois
hnorpois at googlemail.com
Fri May 4 19:29:37 UTC 2012
Hello,
The tutorial at
http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences contains a
script that retrieves genomic coordinates (see below). I tested it with
14 gene IDs and always got coordinates on the "plus strand", meaning that
$from was always lower than $to. In principle this is nice, but I was
surprised. It means either that all my genes are (by chance) on the plus
strand, or that there are two coordinate systems (one for the plus and one
for the minus strand). In the latter case it would be possible
(theoretically, and not very likely) for two genes with different IDs to
share one $from/$to pair, one on the plus and one on the minus strand.
I did not find anything about this issue in the documentation or in the
archive. Could anybody please comment on this?
use strict;
use Bio::DB::EntrezGene;

my $id = shift or die "Id?\n"; # use a Gene id
my $db = new Bio::DB::EntrezGene;
my $seq = $db->get_Seq_by_id($id);
my $ac = $seq->annotation;
for my $ann ($ac->get_Annotations('dblink')) {
    if ($ann->database eq "Evidence Viewer") {
        # get the sequence identifier, the start, and the stop
        my ($contig,$from,$to) = $ann->url =~
            /contig=([^&]+).+from=(\d+)&to=(\d+)/;
        print "$contig\t$from\t$to\n";
    }
}
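To check the ordering without reading the output by eye, the comparison of $from and $to can be made explicit. The following is only a minimal sketch: it applies the same regular expression to a plain URL string, describe_coords is a hypothetical helper (not part of BioPerl), and the URL is made up for illustration. It reports only which coordinate is smaller and says nothing about the strand the gene is actually on.

```perl
use strict;
use warnings;

# Hypothetical helper: extract contig, from and to with the tutorial's
# regular expression and note which of the two coordinates is smaller.
sub describe_coords {
    my ($url) = @_;
    my ($contig, $from, $to) = $url =~ /contig=([^&]+).+from=(\d+)&to=(\d+)/
        or return;    # the URL did not have the expected form
    my $order = $from <= $to ? 'from<=to' : 'from>to';
    return ($contig, $from, $to, $order);
}

# Made-up URL, for illustration only (not a real Evidence Viewer link).
my $url = 'http://example.org/evv.cgi?contig=NT_000000.1&gene=XYZ&from=100&to=200';
my @fields = describe_coords($url);
print join("\t", @fields), "\n" if @fields;
```

Inside the loop above, the same helper could be called as describe_coords($ann->url) for each "Evidence Viewer" annotation.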
Thank you
Hermann Norpois