[Bioperl-l] Homologene parser?

neeti somaiya neetisomaiya at gmail.com
Thu Aug 16 04:22:18 UTC 2007


Hi Siddhartha,

Thanks a lot for your mail.
It would be great if you could send me your parser; I will see how I can
modify it for my purpose.

Thanks and Regards,
Neeti.

On 8/14/07, Siddhartha Basu <basu at pharm.stonybrook.edu> wrote:
>
> neeti somaiya wrote:
> > Hi Andrew,
> >
> > I think the homologene data files on the ftp have changed from the ones
> > you had used.
> > They are now homologene.data and homologene.xml.
> > I tried using your parser, but because it was written for the file
> > hmlg.trip.ftp, it doesn't work anymore.
> >
> > I came across a parser:
> > http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
> > I am looking at it to see if it works for me. Not sure if it will.
> >
> > ~Neeti.
>
> Hi Neeti,
> I have recently written a parser for 'homologene' XML data, specific to
> my purpose. I am not sure whether it will suit yours, but it could
> be extended for general-purpose parsing, so I am putting it forward.
> Here is how it works:
>
> * It only parses a single homologene entry <HG-Entry>.....</HG-Entry>.
> * It does SAX-based parsing (currently uses XML::SAX::ExpatXS).
> * It returns a graph object (using Perl's Graph module) where each node is a
> homologue entry with its corresponding Entrez gene id. Each node also
> contains the following attributes:
>         * RefSeq protein id.
>         * Protein id (pid).
>         * NCBI taxon id.
> * The edge attribute contains information about the ortholog (true/false)
> relationship between two nodes.
> * The rest of the tags are currently not extracted. However, parsing
> them should not be very difficult.
>
> Generally I get the homologene XML stream from an 'efetch' through
> Bio::DB::EUtilities, feed it to the parser, get back a 'Graph' object and
> then work on it.
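
The parse-to-graph idea described above can be sketched roughly as follows. This is not Siddhartha's Perl code: it is a minimal Python illustration using the standard xml.sax module and plain dicts instead of XML::SAX::ExpatXS and the Graph module. The element names inside <HG-Entry> (HG-Gene, HG-Gene_geneid, HG-Gene_taxid, HG-Gene_prot-gi, HG-Gene_prot-acc) are assumptions about the HomoloGene XML layout, the sample ids are made up, and the ortholog flag is simply derived from differing taxon ids, which is a simplification and may not match how the original parser sets it.

```python
import itertools
import xml.sax

# Made-up sample entry; element names are assumptions about the XML layout.
SAMPLE_ENTRY = """<HG-Entry>
  <HG-Entry_hg-id>1</HG-Entry_hg-id>
  <HG-Entry_genes>
    <HG-Gene>
      <HG-Gene_geneid>101</HG-Gene_geneid>
      <HG-Gene_taxid>9606</HG-Gene_taxid>
      <HG-Gene_prot-gi>9001</HG-Gene_prot-gi>
      <HG-Gene_prot-acc>NP_000001.1</HG-Gene_prot-acc>
    </HG-Gene>
    <HG-Gene>
      <HG-Gene_geneid>102</HG-Gene_geneid>
      <HG-Gene_taxid>10090</HG-Gene_taxid>
      <HG-Gene_prot-gi>9002</HG-Gene_prot-gi>
      <HG-Gene_prot-acc>NP_000002.1</HG-Gene_prot-acc>
    </HG-Gene>
    <HG-Gene>
      <HG-Gene_geneid>103</HG-Gene_geneid>
      <HG-Gene_taxid>9606</HG-Gene_taxid>
      <HG-Gene_prot-gi>9003</HG-Gene_prot-gi>
      <HG-Gene_prot-acc>NP_000003.1</HG-Gene_prot-acc>
    </HG-Gene>
  </HG-Entry_genes>
</HG-Entry>"""


class EntryHandler(xml.sax.ContentHandler):
    """Collect one node per <HG-Gene>, keyed by gene id."""

    # Assumed tag name -> node attribute name.
    FIELDS = {
        "HG-Gene_geneid": "gene_id",
        "HG-Gene_taxid": "taxon_id",
        "HG-Gene_prot-gi": "pid",
        "HG-Gene_prot-acc": "refseq",
    }

    def __init__(self):
        super().__init__()
        self.nodes = {}
        self._gene = None   # attributes of the gene currently being read
        self._text = []     # character data of the current element

    def startElement(self, name, attrs):
        self._text = []
        if name == "HG-Gene":
            self._gene = {}

    def characters(self, content):
        self._text.append(content)

    def endElement(self, name):
        if self._gene is None:
            return  # outside any <HG-Gene>; other tags are ignored
        if name in self.FIELDS:
            self._gene[self.FIELDS[name]] = "".join(self._text).strip()
        elif name == "HG-Gene":
            gene_id = self._gene.pop("gene_id")
            self.nodes[gene_id] = self._gene
            self._gene = None


def parse_entry(xml_text):
    """Return (nodes, edges) for a single <HG-Entry> chunk."""
    handler = EntryHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    edges = {}
    for a, b in itertools.combinations(sorted(handler.nodes), 2):
        # Simplifying assumption: genes from different taxa count as orthologs.
        different_taxa = handler.nodes[a]["taxon_id"] != handler.nodes[b]["taxon_id"]
        edges[(a, b)] = {"ortholog": different_taxa}
    return handler.nodes, edges


nodes, edges = parse_entry(SAMPLE_ENTRY)
```

Here nodes maps each gene id to its refseq, pid and taxon_id attributes, and edges maps each pair of gene ids to an ortholog flag, which mirrors the node and edge attributes listed above.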
>
> So, to make it more generic and work on a local file:
>
> * We need another class that reads the chunk between
> <HG-Entry>.....</HG-Entry> and sends it to the parser.
> * Add support for most of the tags.
> * Massage the data into a BioPerl-compatible object.
>
> The first two I could work out; for the last one I have to figure
> out which BioPerl object would be suitable (like Bio::Cluster or
> Bio::Network::Node/Edge).
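
The first step listed above, a reader that pulls each <HG-Entry>.....</HG-Entry> chunk out of a local file and hands it on, might look roughly like this. It is a Python sketch rather than Perl, the function name iter_entries is hypothetical, and it assumes that at most one entry starts or ends on any given line. The wrapper element names in the sample (HG-EntrySet, HG-EntrySet_entries) and the ids are made up for illustration.

```python
import io


def iter_entries(handle, start="<HG-Entry>", end="</HG-Entry>"):
    """Yield each <HG-Entry>...</HG-Entry> chunk from a text file handle.

    Reads line by line, so the whole file never has to sit in memory.
    """
    buffer = []
    inside = False
    for line in handle:
        if not inside and start in line:
            inside = True
            line = line[line.index(start):]  # drop anything before the tag
        if inside:
            if end in line:
                cut = line.index(end) + len(end)
                buffer.append(line[:cut])
                yield "".join(buffer)
                buffer, inside = [], False
            else:
                buffer.append(line)


# Made-up two-entry document standing in for a local homologene.xml file.
SAMPLE_FILE = io.StringIO(
    "<HG-EntrySet>\n"
    "<HG-EntrySet_entries>\n"
    "<HG-Entry>\n"
    "<HG-Entry_hg-id>1</HG-Entry_hg-id>\n"
    "</HG-Entry>\n"
    "<HG-Entry>\n"
    "<HG-Entry_hg-id>2</HG-Entry_hg-id>\n"
    "</HG-Entry>\n"
    "</HG-EntrySet_entries>\n"
    "</HG-EntrySet>\n"
)

chunks = list(iter_entries(SAMPLE_FILE))
```

Each string in chunks is a complete entry that could be passed to a single-entry parser. Matching on the full tag text, including the closing ">", keeps tags such as <HG-Entry_hg-id> or <HG-EntrySet> from being mistaken for entry boundaries.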
>
> Let me know if it sounds interesting and I will send you the code.
>
> -siddhartha
>
>
> >
> > On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
> >> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
> >>
> >>> Hi,
> >>>
> >>> Does anyone know of any Homologene parser, if available?
> >>> Please let me know.
> >>>
> >>> Thanks and Regards,
> >>> Neeti.
> >> Hi Neeti,
> >>
> >> Quite a long time ago now I wrote a Homologene parser and posted it
> >> to the mailing list:
> >>
> >> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
> >>
> >> I don't know if this still works but you could use it as a starting
> >> point. There may also be something newer out there too, I don't know.
> >> If you search the mailing list archives you'll get a few messages
> >> around the topic.
> >>
> >> Cheers, Andrew.
> >>
> >>
> >> Andrew Macgregor
> >> Centre for Comparative Genomics, Murdoch University
> >> Email: amacgregor at ccg.murdoch.edu.au
> >> Tel: (08) 9360 2961
> >>
> >>
> >>
> >>
> >
> >
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>



-- 
-Neeti
Even my blood says, B positive


