[Bioperl-l] extract information with Bio::DB::GFF3 from gff3 file

Scott Cain cain.cshl at gmail.com
Thu Jul 19 02:47:53 UTC 2007


[Please always reply to the mailing list so that answers can be archived]


Yes, because commas are not allowed in GFF3 in an unescaped form.
Essentially, you are doing this with your GFF3:

  Name=receptor kinase ORK10;Name= putative

and when you do this:

  my ($name) = $gene->attributes('Name');

you are getting the first item in the list of names, and I suspect which
one you get is random.
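
To see why the value gets split, here is a minimal sketch in plain Perl
(no BioPerl needed). parse_gff3_attributes is a hypothetical helper I made
up for illustration; it is not the code Bio::DB::GFF uses internally. It
only applies the GFF3 rule that an unescaped comma separates multiple
values of one tag:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split a GFF3 ninth column
# into tag => [values], treating unescaped commas as value separators.
sub parse_gff3_attributes {
    my ($column9) = @_;
    my %attrs;
    for my $pair (split /;/, $column9) {
        next unless length $pair;
        my ($tag, $value) = split /=/, $pair, 2;
        next unless defined $value;
        for my $v (split /,/, $value) {
            # Undo %XX escapes only after splitting on the raw commas.
            $v =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
            push @{ $attrs{$tag} }, $v;
        }
    }
    return \%attrs;
}

my $attrs = parse_gff3_attributes(
    'ID=12001.t00153;Name=receptor kinase ORK10, putative');

# Prints the number of Name values, then each value in quotes.
print scalar(@{ $attrs->{Name} }), " Name values\n";
print "  '$_'\n" for @{ $attrs->{Name} };
```

With your line this reports two Name values, 'receptor kinase ORK10' and
' putative', rather than one.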

To fix it, you need to replace the comma with %2C (the URL escape code
for a comma).  If you generated this GFF3, you will need to add a step
to URI encode your attribute strings.  If you got it from someone else,
you should point out to them that their GFF is flawed.
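
If you are writing the GFF3 yourself, the encoding step could look
something like the sketch below. escape_gff3_value is a hypothetical name,
and the character set is my reading of what GFF3 reserves in the ninth
column (";", "=", "&", ","), plus "%" itself and control characters; adjust
it if your data needs more:

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode the characters that are reserved
# in GFF3 attribute values, so a comma becomes %2C and so on.
sub escape_gff3_value {
    my ($value) = @_;
    $value =~ s/([;=&,%\x00-\x1F\x7F])/sprintf("%%%02X", ord($1))/ge;
    return $value;
}

my $name = escape_gff3_value('receptor kinase ORK10, putative');

# Prints: ID=12001.t00153;Name=receptor kinase ORK10%2C putative
print "ID=12001.t00153;Name=$name\n";
```

Apply it to each attribute value separately, before joining the tag=value
pairs with semicolons; escaping the whole ninth column at once would also
encode the separators.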

Scott


On Thu, 2007-07-19 at 10:32 +0800, Xianran Li wrote:
> However, $name returns the string "putative" rather than "receptor kinase ORK10". Is there any particular reason?
> 
> 
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> Assuming $gene is a Bio::DB::GFF::Feature object (there is no such thing
> as Bio::DB::GFF3), then you can use the attributes method to get
> anything in the ninth column:
> 
>   my ($name) = $gene->attributes('Name');
> 
> The parentheses are needed around $name because the attributes method
> returns a list and the parens capture the first item of the list into
> $name.
> 
> Scott
> 
> 
> On Wed, 2007-07-18 at 13:55 +0800, Xianran Li wrote:
> > Hi,
> > 
> > I want to extract some information from the gff3 file like:
> > 
> > 12001 . gene 854759 857385 . - . ID=12001.t00153;Name=receptor kinase ORK10, putative
> >    
> > The gene position can be retrieved as $gene->start, but how can I get the annotation information (receptor kinase ORK10)?
> > 
> > Thanks for your help.
> > 
> > 
> > Xianran Li
> ----- Original Message ----- 
> From: "Scott Cain" <cain.cshl at gmail.com>
> To: "Xianran Li" <xianranli78 at yahoo.com.cn>
> Cc: <bioperl-l at lists.open-bio.org>
> Sent: Wednesday, July 18, 2007 9:10 PM
> Subject: Re: [Bioperl-l] extract information with Bio::DB::GFF3 from gff3 file
> 
> 
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory
