[Bioperl-l] Intron and exon information
David Block
dblock@gnf.org
Fri, 14 Jun 2002 08:29:41 -0700
A GCG->GeneStructure parser would be great - I'm sure it would find a use somewhere :)
If you could take your input files as a model and extend the parser:
- read the case of the sequence (regex)
- deduce exon/intron from the case
- create a parent GeneStructure::Gene object, and then add the exons
by their start/stop coordinates to the Gene
- after the exons are in, you can just ask for the introns and get them as a list,
IIRC.
- once this is done, submit the code to the list; we'll take a look at it and patch
it into the head of CVS, wherever it belongs (right, Jason?)
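
To make the first few steps concrete, here is a rough sketch of the case-to-coordinates part, written in plain Python with no BioPerl calls. It assumes uppercase runs are exons and lowercase runs are introns (the thread does not say which way round GCG marks them, so this is a switch), and it reports 1-based inclusive start/end pairs. The function names exon_spans and intron_spans are made up for illustration only. The resulting pairs are what you would feed to the Gene object as exon coordinates.

```python
import re
from typing import List, Tuple

# A span is a 1-based, inclusive (start, end) pair.
Span = Tuple[int, int]


def exon_spans(seq: str, exon_is_upper: bool = True) -> List[Span]:
    """Return the coordinates of every run of exon-cased letters.

    Assumption: one case marks exons and the other marks introns.
    Any non-letter symbol (gap characters, digits) simply ends a run.
    """
    pattern = r"[A-Z]+" if exon_is_upper else r"[a-z]+"
    # m.start() is 0-based, m.end() is exclusive, so (start + 1, end)
    # gives a 1-based inclusive span.
    return [(m.start() + 1, m.end()) for m in re.finditer(pattern, seq)]


def intron_spans(exons: List[Span]) -> List[Span]:
    """Derive introns as the gaps between consecutive exons.

    Sequence before the first exon or after the last one is not
    reported, since it is not enclosed by two exons.
    """
    ordered = sorted(exons)
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
        if next_start - prev_end > 1  # skip exons that touch each other
    ]


# Made-up toy sequence: three uppercase exons, two lowercase introns.
seq = "ATGGCCgtaagtttcagGTTCTGgtgagcagTAA"
exons = exon_spans(seq)
introns = intron_spans(exons)
print(exons)    # [(1, 6), (18, 23), (32, 34)]
print(introns)  # [(7, 17), (24, 31)]
```

The same regex approach carries over to Perl more or less directly. Whether the gene object hands back the same intron list on its own is something to check against the perldoc.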
Good luck - let us know how it's going...
--
David Block dblock@gnf.org
GNF - San Diego, CA http://www.gnf.org
Genome Informatics / Enterprise Programming
> -----Original Message-----
> From: Lars G. T. Jorgensen [mailto:larsj@diku.dk]
> Sent: Friday, June 14, 2002 8:14 AM
> To: bioperl-l@bioperl.org
> Subject: Re: [Bioperl-l] Intron and exon information
>
>
> "David Block" <dblock@gnf.org> writes:
>
> > This is kind of in my area - although I've been stuck in Java-land
> > for a while.
> >
> > These 'files' of yours - what format are they in - just FASTA?
>
> The datafiles are outputs from the GCG suite. The Seq object accepts
> the sequences, so that's fine. But they use casing to represent
> exons/introns, and SeqIO::gcg throws this information away.
>
> So I was thinking about patching the parser, but I don't know if that
> is against the Design to let the SeqIO add features to a Seq object.
>
> But I think GCG can do FASTA output. Does this contain information
> about introns?
>
> BTW, is there a printable class diagram somewhere? We don't have an A0
> printer here...
>
> >
> > The alignments are done, right? So what you need to do is figure out
> > where the introns are, and then deduce the phase?
> >
> > You're going to have to create a Bio::SeqFeature::GeneStructure::Gene
> > object, and use its intron capabilities. Take a look at the perldoc
> > for that module, see if you can shoehorn your data into there, and
> > then I think Hilmar's excellent work will give you the intron data.
> >
> > Let us know if this helps.
>
> --
> Mvh|Regards, Lars
> System administrator | Student
> Bioinformatics Centre | Department of Computer Science
> University of Copenhagen | University of Copenhagen
> http://www.binf.ku.dk | http://www.diku.dk
> When's the last time you used duct tape on a duct? -- Larry Wall