[Bioperl-l] How to get the intron phase

Jason Stajich jason.stajich at duke.edu
Fri Jul 8 11:12:04 EDT 2005


You can calculate it pretty easily: build the split-location object
from the location string, then keep a running total of the exon lengths.
The phase of each intron is that total modulo 3.

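use strict;
use warnings;
use Bio::Factory::FTLocationFactory;

my $file = shift @ARGV or die "usage: $0 file.fasta\n";
# Read only the FASTA header lines. The '-|' mode makes open() run
# grep and read from its output (the original call lacked the pipe).
open(my $fh, '-|', 'grep', '^>', $file) or die "cannot run grep: $!";

while (<$fh>) {
    if ( /loc=(\S+):(\S+);/ ) {
        my ($seqid, $locationstr) = ($1, $2);
        my $location = Bio::Factory::FTLocationFactory->from_string($locationstr);
        my $runninglength = 0;
        my $i = 0;
        # NOTE: exons are used in the order each_Location returns them;
        # for a complement(...) location, check that this is 5'->3' order.
        my @exons = $location->each_Location;
        my $last  = $#exons;    # index of the last exon; no intron follows it
        for my $exon (@exons) {
            # I may be sloppy here, pls check that this is working the way
            # you expect.
            # Defining A^TG as phase 1 and AT^G as phase 2.
            my $phase = ( $runninglength += $exon->length ) % 3;
            if ( $i != $last ) {
                print "phase of intron $i is $phase\n";
            }
            $i++;
        }
    }
}
close($fh);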
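
If you want to check the arithmetic without BioPerl, here is a minimal
sketch of the same running-length idea in plain Perl. The helper name
intron_phases is hypothetical (it is not a BioPerl function), and it
rests on two assumptions: the first listed exon starts in frame, and a
complement(...) string lists exons in ascending coordinates, so they
are reversed to get transcript order.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: returns the phase of each
# intron in transcript order for a location string such as
# "join(1..4,10..14,20..25)" or "complement(1..4,10..14,20..25)".
sub intron_phases {
    my ($locstr) = @_;

    # Collect [start, end] pairs in the order they are written.
    my @exons;
    while ( $locstr =~ /(\d+)\.\.(\d+)/g ) {
        push @exons, [ $1, $2 ];
    }

    # Assumption: complement(...) lists exons in ascending coordinates,
    # so transcript (5'->3') order is the reverse.
    @exons = reverse @exons if $locstr =~ /^\s*complement/;

    my @phases;
    my $running = 0;
    for my $i ( 0 .. $#exons - 1 ) {    # no intron after the last exon
        my ( $start, $end ) = @{ $exons[$i] };
        $running += $end - $start + 1;  # exon length, inclusive coordinates
        push @phases, $running % 3;     # 0 = between codons, 1 = A^TG, 2 = AT^G
    }
    return @phases;
}

# Exon lengths 4, 5, 6: running totals 4 and 9 give phases 1 and 0.
print join( ",", intron_phases("join(1..4,10..14,20..25)") ), "\n";   # prints 1,0
```

If the coordinates include UTR, as they may for a type=transcript
record, the running length would have to start at the first coding
base instead of the first exon base.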

On Jul 8, 2005, at 10:39 AM, Filipe Garrett wrote:

> Hi all,
>
> I'm new to bioperl and I was looking for a way to obtain the intron  
> phases from genes in a FASTA format like this:
>
> >CG3427-RA type=transcript; loc=2R:complement(2273725..2274587,
> 2274647..2274996,2275280..2275413,2275634..2275804,
> 2275864..2276117,2276188..2276549,2277349..2277510,
> 2277748..2277924,2278864..2279008,2279228..2279373,
> 2279935..2280127,2280182..2280323,2280392..2280478,
> 2280739..2280836,2281121..2281172,2285453..2285599,
> 2300275..2300819); ID=CG3427-RA; name=Epac-RA;
> db_xref=FlyBase:FBtr0086132,FlyBase:FBgn0033102,Gadfly:CG3427-RA;  
> release=r4.1; species=dmel; len=4028
> CTCTCCAGCGGCGCACAACTCGATCGCTGGCCCAGAGGTTCAGTTCGGTT
> TGGTTCGGTTCGGTTTGAATCTCTGCCTCTGTTTACGCCTCTATATC...
>
> I've looked at the script directory and found the phase method  
> inside the Bio::SeqFeature::Gene::Intron object, but the examples  
> are from data parsed from a GFF file.
>
> Can I bypass the GFF stuff and use the FASTA header information  
> directly?
>
> Thanks in advance,
>
> Bests

--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/



