[Bioperl-l] How to get the intron phase
Jason Stajich
jason.stajich at duke.edu
Fri Jul 8 11:12:04 EDT 2005
You can calculate it pretty easily: just build the split-location
object from the location string in the FASTA header.
use strict;
use warnings;
use Bio::Factory::FTLocationFactory;

my $file = shift @ARGV;
# read only the FASTA header lines; the trailing '|' makes open() read
# from the grep command instead of treating the string as a filename
open(my $fh, "grep '^>' $file |") || die "cannot run grep on $file: $!";
while (<$fh>) {
    if ( /loc=(\S+):(\S+);/ ) {
        my ($seqid, $locationstr) = ($1, $2);
        my $location = Bio::Factory::FTLocationFactory->from_string($locationstr);
        my $runninglength = 0;
        my $i = 0;
        my @exons = $location->each_Location;
        my $last = $#exons;    # index of the last exon; no intron follows it
        for my $exon (@exons) {
            # I may be sloppy here, pls check that this is working the way you expect
            # defining A^TG is phase 1 and AT^G is phase 2
            my $phase = ( $runninglength += $exon->length ) % 3;
            if ( $i != $last ) {
                print "phase of intron $i is $phase\n";
            }
            $i++;
        }
    }
}
close($fh);
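
If you want to sanity-check the arithmetic without BioPerl, here is a
rough sketch that pulls the start..end pairs out of the location string
with a regex and takes the running exon length mod 3. intron_phases is a
made-up helper name, not a BioPerl method. It assumes that exons inside
complement(...) are listed in ascending coordinate order, so it reverses
them to get transcript order. Like the code above, it counts from the
first base of the transcript, so if the transcript has a 5' UTR you would
still need to offset by the CDS start, which this header does not give.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): given a feature-table style
# location string, return the phase of each intron in transcript order.
sub intron_phases {
    my ($locstr) = @_;
    my $minus = $locstr =~ /complement/ ? 1 : 0;

    # Collect every start..end pair in the order it is written.
    my @exons;
    while ( $locstr =~ /(\d+)\.\.(\d+)/g ) {
        push @exons, [ $1, $2 ];
    }

    # Assumption: minus-strand exons are written in ascending coordinate
    # order, so transcript order is the reverse of the written order.
    @exons = reverse @exons if $minus;

    my @phases;
    my $running = 0;
    for my $i ( 0 .. $#exons - 1 ) {    # no intron after the last exon
        my ( $start, $end ) = @{ $exons[$i] };
        $running += $end - $start + 1;  # coordinates are inclusive
        push @phases, $running % 3;
    }
    return @phases;
}

# Toy example with three exons of length 4, 5 and 6 on the minus strand.
my @phases = intron_phases('complement(1..4,10..14,20..25)');
for my $n ( 1 .. scalar @phases ) {
    print "phase of intron $n is $phases[$n - 1]\n";
}
```

On the toy location the exons are read as 20..25, 10..14, 1..4, so the
running lengths after the first two exons are 6 and 11. Compare the
result against the BioPerl version on a gene whose phases you already
know before trusting either one.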
On Jul 8, 2005, at 10:39 AM, Filipe Garrett wrote:
> Hi all,
>
> I'm new to bioperl and I was looking for a way to obtain the intron
> phases from genes in a FASTA format like this:
>
> >CG3427-RA type=transcript; loc=2R:complement(2273725..2274587,
> 2274647..2274996,2275280..2275413,2275634..2275804,
> 2275864..2276117,2276188..2276549,2277349..2277510,
> 2277748..2277924,2278864..2279008,2279228..2279373,
> 2279935..2280127,2280182..2280323,2280392..2280478,
> 2280739..2280836,2281121..2281172,2285453..2285599,
> 2300275..2300819); ID=CG3427-RA; name=Epac-RA;
> db_xref=FlyBase:FBtr0086132,FlyBase:FBgn0033102,Gadfly:CG3427-RA;
> release=r4.1; species=dmel; len=4028
> CTCTCCAGCGGCGCACAACTCGATCGCTGGCCCAGAGGTTCAGTTCGGTT
> TGGTTCGGTTCGGTTTGAATCTCTGCCTCTGTTTACGCCTCTATATC...
>
> I've looked at the script directory and found the phase method
> inside the Bio::SeqFeature::Gene::Intron object, but the examples
> are from data parsed from a GFF file.
>
> Can I bypass the GFF stuff and use the FASTA header information
> directly?
>
> Thanks in advance,
>
> Bests
--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/