[Bioperl-l] Unusual behaviour of SeqIO::tigr
Josh Lauricha
laurichj at bioinfo.ucr.edu
Fri Mar 5 13:45:10 EST 2004
On Thu 03/04/04 16:57, Morten Lindow wrote:
> I am trying to build a table of genomic features from the tigrxml-format
> of a rice (pseudo)chromosome.
>
> However, the tigr parser seems to behave differently from genbank/embl:
> tigr.pm treats every TU (etc.) as an individual sequence, and hence resets
> its coordinate system every time it starts on a new TU.
>
> My question is:
> Is there a BioPerl way to get the global coordinates, as when I am
> parsing a GenBank file of a whole chromosome?
Sure:
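use strict;
use warnings;
use Bio::SeqIO;

my $tigrin = Bio::SeqIO->new( -format => 'tigr', -file => 'chr01.xml' );
while ( my $seq = $tigrin->next_seq ) {
    my ($source) = grep { $_->primary_tag() eq 'source' }
                   $seq->get_SeqFeatures();

    # These are the 5' and 3' ends of each TU
    my ($end5) = $source->get_tag_values('end5');
    my ($end3) = $source->get_tag_values('end3');
    my $strand = $end3 <=> $end5;    # 1 on the forward strand, -1 on the reverse

    # Then for each location just do (here: the location of every
    # feature on the TU; pick whichever locations you need):
    foreach my $feat ( $seq->get_SeqFeatures() ) {
        my $loc   = $feat->location();
        my $start = $end5 + ( $loc->start() - 1 ) * $strand;
        my $end   = $end5 + ( $loc->end()   - 1 ) * $strand;
        # ... use $start and $end; on the reverse strand $start > $end
    }
}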
This bit of code isn't tested, but I've been using these a lot, so it is
probably correct.
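
To check the arithmetic without a TIGR XML file at hand, here is a rough
sketch of the same conversion as plain Perl. The helper names
(tu_to_global, tu_range_to_global) and the end5/end3 numbers are made up
for illustration and are not part of BioPerl; the second helper also
reorders the pair so that start <= end, which the snippet above does not do.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): map a 1-based position inside
# a TU onto the chromosome, given the TU's end5/end3 tag values.
sub tu_to_global {
    my ( $end5, $end3, $pos ) = @_;
    my $strand = $end3 <=> $end5;    # 1 forward, -1 reverse, 0 if equal
    return $end5 + ( $pos - 1 ) * $strand;
}

# Convert a local start/end pair; return (low, high, strand) so that the
# first coordinate is never larger than the second.
sub tu_range_to_global {
    my ( $end5, $end3, $start, $end ) = @_;
    my ( $s, $e ) = map { tu_to_global( $end5, $end3, $_ ) } ( $start, $end );
    ( $s, $e ) = ( $e, $s ) if $s > $e;
    return ( $s, $e, $end3 <=> $end5 );
}

# Made-up example values: one forward-strand TU and one reverse-strand TU.
my @fwd = tu_range_to_global( 1000, 2000, 1, 10 );
my @rev = tu_range_to_global( 2000, 1000, 1, 10 );
print "@fwd\n";    # 1000 1009 1
print "@rev\n";    # 1991 2000 -1
```

On the reverse strand, local position 1 sits at end5 and larger local
positions count downwards along the chromosome, which is why the pair is
swapped before it is returned.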
--
------------------------------------------------------
| Josh Lauricha | Ford, your turning into |
| laurichj at bioinfo.ucr.edu | a penguin. Stop it. |
| Bioinformatics, UCR | |
|----------------------------------------------------|
| OpenPG: |
| 5A0D 92D3 D093 79DE F724 1137 6DF1 B5EB D9CE AAA8 |
|----------------------------------------------------|