[Bioperl-l] usage question, Genemark parser
Jason Stajich
jason at cgt.mc.duke.edu
Thu Apr 24 22:31:49 EDT 2003
On Thu, 24 Apr 2003, Thomas Keller wrote:
> Greetings,
> I thought I would try the Genemark parser (Bio/Tools/Genemark.pm) with
> the prokaryotic model output, though the authors warn it may not work.
> I can dump the @exon_arr and it contains the location info, but then
> what? I'm stuck on the usage. (... I am new to using OOP)
> here's the code:
> use strict;
> use Bio::Tools::Genemark;
> use Bio::Seq;
use Bio::SeqIO;
>
> my $results = $ARGV[0] ;
>
> # genome sequence file from which I hope to extract the CDS segments
> my $source = "~/Documents/Consults/GenePrediction_out/0209_cat1_37.fa";
>
> my $Genemark = Bio::Tools::Genemark->new(-file => $results);
> my $seqio = Bio::SeqIO->new( -format => 'fasta', -file => "$source");
# ^^^^^-- not EMBL if this is really a .fa (fasta) file
> my $seqobj = $seqio->next_seq();
>
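One thing to watch in the $source path above: Perl's open() does not expand a leading "~" the way the shell does, so the file will probably not be found as written. A minimal sketch of one workaround follows. The helper name expand_tilde is made up for illustration and is not part of BioPerl, and it assumes $ENV{HOME} is set.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): replace a leading "~" with
# the given home directory, defaulting to $ENV{HOME}.
# "~user" forms are left untouched.
sub expand_tilde {
    my ($path, $home) = @_;
    $home = $ENV{HOME} unless defined $home;
    $path =~ s{^~(?=/|\z)}{$home} if defined $home;
    return $path;
}

my $source = expand_tilde("~/Documents/Consults/GenePrediction_out/0209_cat1_37.fa");
print "$source\n";
```

The expanded $source can then be passed to Bio::SeqIO->new(-file => $source) as before.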
# you can also use Bio::DB::Flat if you want random access to sequences
# in a database; you can get the sequence id to look up from
# $gene->seq_id, I >>think<<
my $cdna_out = Bio::SeqIO->new(-format => 'fasta', -file => ">cdna.fa");
my $pep_out  = Bio::SeqIO->new(-format => 'fasta', -file => ">pep.fa");
> while(my $gene = $Genemark->next_prediction()) {
$seqobj->add_SeqFeature($gene);
my $cDNA = $gene->cds();
my $pep = $cDNA->translate(); # or $gene->protein();
# there is also an mRNA method but I don't think
# Genemark predicts UTRs ?
$cDNA->display_id("geneXXX-cDNA"); # or however you want to uniquely identify it
$pep->display_id("geneXXX-pep"); # ditto
$cdna_out->write_seq($cDNA);
$pep_out->write_seq($pep);
> my @exon_arr = $gene->exons();
> foreach my $cds (@exon_arr) {
> #use the location info
> #to get the sequence string from the $seqobj
> #how?
> }
> }
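On the "how?" inside the exon loop: each exon is a sequence feature with start, end and strand, and those coordinates are 1-based and inclusive. With BioPerl, $seqobj->subseq($cds->start, $cds->end) returns the string for that range, and $seqobj->trunc($cds->start, $cds->end)->revcom gives a reverse-complemented sequence object for a minus-strand feature. The sketch below shows the same coordinate arithmetic in plain Perl so it runs without BioPerl. The helper name extract_region and the toy sequence are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the bases from $start to $end (1-based,
# inclusive) of $dna, reverse-complemented when $strand is negative.
sub extract_region {
    my ($dna, $start, $end, $strand) = @_;
    my $sub = substr($dna, $start - 1, $end - $start + 1);
    if (defined $strand && $strand < 0) {
        $sub = reverse $sub;               # scalar context: reverses the string
        $sub =~ tr/ACGTacgt/TGCAtgca/;     # complement each base
    }
    return $sub;
}

# Toy sequence and coordinates, made up for illustration.
my $genome = "AAACCCGGGTTT";
print extract_region($genome, 2, 5,  1), "\n";   # forward strand
print extract_region($genome, 2, 5, -1), "\n";   # reverse strand
```

In the real loop the arguments would come from $cds->start, $cds->end and $cds->strand, with $seqobj->seq as the genome string.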
>
> # the next step is to design primers for each putative gene sequence.
> I'd love suggestions for that as well.
>
> __DATA__
> the prokaryotic output looks like:
>
> GeneMark.hmm PROKARYOTIC (Version 2.1)
> Sequence file name: sequence, RBS: Y
> Model file name: pseudonative.mod_iteration_4
> Model organism: Pseudonative.model
> Wed Jan 15 14:38:57 2003
>
> Predicted genes
>  Gene   Strand   LeftEnd   RightEnd     Gene   Class
>     #                                 Length
>     1        -         1        591      591       1
>     2        -       605        940      336       1
>     3        -      1631       1726       96       1
>     4        -      2401       2853      453       1
>     5        +      3104       3697      594       1
>     6        +      3703       3897      195       1
>     7        -      4061       4708      648       1
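If the module turns out not to handle this prokaryotic layout, the gene rows above are regular enough to read with a regular expression. This is only a sketch for the six-column table shown here (gene number, strand, left end, right end, length, class). It is not how Bio::Tools::Genemark parses its input, and the helper name parse_gene_row is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one row of the "Predicted genes" table into
# a hash reference, or return nothing if the line is not a gene row.
sub parse_gene_row {
    my ($line) = @_;
    return unless $line =~ /^\s*(\d+)\s+([+-])\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$/;
    return {
        gene   => $1,
        strand => ($2 eq '+' ? 1 : -1),
        start  => $3,
        end    => $4,
        length => $5,
        class  => $6,
    };
}

# Header and title lines are skipped; only gene rows are printed.
for my $line ("Predicted genes", "    1   -    1   591   591   1") {
    my $row = parse_gene_row($line) or next;
    print join("\t", @$row{qw(gene strand start end length)}), "\n";
}
```

The start, end and strand values from each row can be used directly as coordinates on the genome sequence.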
>
> Thanks for your help.
> Tom K
> Tom Keller, Ph.D.
> http://www.ohsu.edu/core
> kellert at ohsu.edu
> 503-494-2442
>
--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu