[Bioperl-l] usage question, Genemark parser

Jason Stajich jason at cgt.mc.duke.edu
Thu Apr 24 22:31:49 EDT 2003


On Thu, 24 Apr 2003, Thomas Keller wrote:

> Greetings,
> I thought I would try the Genemark parser (Bio/Tools/Genemark.pm) with
> the prokaryotic model output, though the authors warn it may not work.
> I can dump the @exon_arr and it contains the location info, but then
> what? I'm stuck on the usage. (... I am new to using OOP)
> here's the code:
> use strict;
> use Bio::Tools::Genemark;
> use Bio::Seq;
  use Bio::SeqIO;
>
> my $results = $ARGV[0] ;
>
> # genome sequence file from which I hope to extract the CDS segments
> my $source = "~/Documents/Consults/GenePrediction_out/0209_cat1_37.fa";
# note: Perl does not expand ~ in a file name, so opening this path will fail;
# use "$ENV{HOME}/Documents/Consults/GenePrediction_out/0209_cat1_37.fa" instead
>
> my $Genemark = Bio::Tools::Genemark->new(-file => $results);
> my $seqio = Bio::SeqIO->new( -format => 'fasta', -file => "$source");
#                                          ^^^^^-- not EMBL if this is really a .fa (fasta) file
> my $seqobj = $seqio->next_seq();
>
# you can also use Bio::DB::Flat if you want random access to sequences
# in a database
# you can get the id to look up from $gene->seq_id, I >>think<<

my $cdna_out = Bio::SeqIO->new(-format => 'fasta', -file => ">cdna.fa");
my $pep_out  = Bio::SeqIO->new(-format => 'fasta', -file => ">pep.fa");
> while(my $gene = $Genemark->next_prediction()) {
        $seqobj->add_SeqFeature($gene);
        my $cDNA = $gene->cds();
        my $pep  = $cDNA->translate(); # or $gene->protein();
        # there is also an mRNA method, but I don't think
        # Genemark predicts UTRs?
        $cDNA->display_id("geneXXX-cDNA"); # or however you want to uniquely identify it
        $pep->display_id("geneXXX-pep");   # ditto
        $cdna_out->write_seq($cDNA);
        $pep_out->write_seq($pep);
> 	my @exon_arr = $gene->exons();
>      foreach my $cds (@exon_arr) {


>      	#use the location info
>      	#to get the sequence string from the $seqobj
>      	#how?
>      }
> }
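
For the "how?" in the exon loop: once the gene has been added to $seqobj,
each feature should be able to hand back its own sequence. A minimal sketch,
assuming BioPerl is installed; the toy sequence, the coordinates and the
Bio::SeqFeature::Generic object standing in for one of your exons are all
made up for illustration:

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;

# Toy genome (hypothetical); in the real script this is the $seqobj read by Bio::SeqIO.
my $seqobj = Bio::Seq->new(
    -id       => 'toy',
    -alphabet => 'dna',
    -seq      => 'ATGAAACCCGGGTTTTAA',
);

# Stand-in for one predicted exon on the minus strand, bases 4..9 (1-based, inclusive).
my $exon = Bio::SeqFeature::Generic->new(
    -primary_tag => 'exon',
    -start       => 4,
    -end         => 9,
    -strand      => -1,
);

# Attaching the feature lets it find the sequence it sits on.
$seqobj->add_SeqFeature($exon);

# Option 1: slice the plus strand directly by coordinates.
my $plus_str = $seqobj->subseq($exon->start, $exon->end);

# Option 2: ask the feature; seq() reverse-complements minus-strand features.
my $exon_str = $exon->seq->seq;

print "plus strand slice: $plus_str\n";
print "exon sequence:     $exon_str\n";
```

In your loop that would be $cds->seq->seq in place of the "how?" comment,
provided the exon objects returned by $gene->exons() have the sequence
attached. I have not checked that against prokaryotic Genemark output.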
>
> # the next step is to design primers for each putative gene sequence.
> I'd love suggestions for that as well.
>
> __DATA__
> the  prokaryotic output looks like:
>
> GeneMark.hmm PROKARYOTIC (Version 2.1)
> Sequence file name: sequence,	RBS: Y
> Model file name: pseudonative.mod_iteration_4
> Model organism: Pseudonative.model
> Wed Jan 15 14:38:57 2003
>
> Predicted genes
>     Gene    Strand    LeftEnd    RightEnd       Gene     Class
>      #                                         Length
>      1        -           1         591          591        1
>      2        -         605         940          336        1
>      3        -        1631        1726           96        1
>      4        -        2401        2853          453        1
>      5        +        3104        3697          594        1
>      6        +        3703        3897          195        1
>      7        -        4061        4708          648        1
>
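
If the parser does choke on the prokaryotic format, the table above is simple
enough to read by hand. A rough sketch in plain Perl (no BioPerl), assuming
every gene line has exactly the six columns shown; parse_genemark_table,
gene_sequence and revcomp are names I made up, and the toy genome and
coordinates are not your data:

```perl
use strict;
use warnings;

# Reverse-complement a DNA string.
sub revcomp {
    my ($dna) = @_;
    my $rc = scalar reverse $dna;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Pull gene number, strand, left end, right end, length and class
# out of the "Predicted genes" table; any other line is skipped.
sub parse_genemark_table {
    my (@lines) = @_;
    my @genes;
    for my $line (@lines) {
        next unless $line =~ /^\s*(\d+)\s+([+-])\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$/;
        push @genes, {
            id => $1, strand => $2, start => $3,
            end => $4, length => $5, class => $6,
        };
    }
    return @genes;
}

# Coordinates are 1-based and inclusive; minus-strand genes are reverse-complemented.
sub gene_sequence {
    my ($genome, $gene) = @_;
    my $dna = substr($genome, $gene->{start} - 1, $gene->{end} - $gene->{start} + 1);
    return $gene->{strand} eq '-' ? revcomp($dna) : $dna;
}

# Toy input in the same layout as the report (hypothetical values).
my $genome = 'ATGAAACCCGGGTTTTAAGC';
my @report = (
    'Predicted genes',
    '    Gene    Strand    LeftEnd    RightEnd       Gene     Class',
    '     #                                         Length',
    '     1        +           1           6            6        1',
    '     2        -          10          15            6        1',
);

my @genes = parse_genemark_table(@report);
my @seqs  = map { gene_sequence($genome, $_) } @genes;
print "gene $genes[$_]{id} ($genes[$_]{strand}): $seqs[$_]\n" for 0 .. $#genes;
```

For the real thing you would read the report line by line from the file in
$ARGV[0] and take the genome string from $seqobj->seq(). Partial-gene markers
or extra columns, if your output has any, are not handled here.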
> Thanks for your help.
> Tom K
> Tom Keller, Ph.D.
> http://www.ohsu.edu/core
> kellert at ohsu.edu
> 503-494-2442
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

