[Bioperl-l] genomic coordinates always on the plus strand

Hermann Norpois hnorpois at googlemail.com
Fri May 4 19:29:37 UTC 2012


Hello,

in the tutorial
http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences there is a
script that retrieves genomic coordinates (see below). I tested it with
14 gene IDs and always got coordinates on the "plus strand", meaning that
$from was always a lower number than $to. In principle this is nice, but I
was surprised. It means either that all my genes are (by chance) on the
plus strand, or that there are two sets of coordinates (one for the plus
and one for the minus strand). In the second case it would be possible
(theoretically, and not very likely) that there are two genes for one
$from/$to pair (one on the plus and one on the minus strand, with the same
coordinates but different IDs). I did not find anything about this issue
in the documentation or in the archive. Could anybody please comment on
this?

use strict;
use Bio::DB::EntrezGene;

my $id = shift or die "Id?\n"; # use a Gene id
my $db = Bio::DB::EntrezGene->new;
my $seq = $db->get_Seq_by_id($id);
my $ac = $seq->annotation;
for my $ann ($ac->get_Annotations('dblink')) {
    if ($ann->database eq "Evidence Viewer") {
        # get the sequence identifier, the start, and the stop
        my ($contig,$from,$to) = $ann->url =~
          /contig=([^&]+).+from=(\d+)&to=(\d+)/;
        print "$contig\t$from\t$to\n";
    }
}
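
A minimal sketch of how the $from/$to ordering could be checked on a single
link, and how one could see whether the link carries any strand-like field
at all. The URL below is made up for illustration (the host, the contig
name, and the numbers are assumptions, not real Evidence Viewer output),
and the %param hash is only a helper introduced for this sketch.

```perl
use strict;
use warnings;

# Hypothetical URL, invented for illustration; real links may look different.
my $url = 'http://example.org/viewer?contig=NT_000000.1&from=2000&to=1000';

# Split the query string into key/value pairs instead of relying on
# the order in which the parameters appear.
my ($query) = $url =~ /\?(.*)$/;
my %param = map { my ($k, $v) = split /=/, $_, 2; ($k => $v) } split /&/, $query;

# Compare the two coordinates numerically and report their order.
my ($from, $to) = @param{qw(from to)};
my $order = $from <= $to ? 'from <= to' : 'from > to';
print join("\t", $param{contig}, $from, $to, $order), "\n";

# List every parameter, to see whether any of them looks like a strand field.
print "$_=$param{$_}\n" for sort keys %param;
```

Running something like this over the real URLs for all 14 gene IDs would
show whether $from is ever larger than $to, and whether the links contain
anything besides contig, from and to.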


Thank you
Hermann Norpois


