[Bioperl-l] Entrez Gene and bioperl-db

Hilmar Lapp hlapp at gmx.net
Tue Dec 28 01:17:42 EST 2004


Great to hear that someone is giving this a shot. Yes, at this point it
appears that NCBI is only offering the ASN.1, not a conversion to XML.
Their asn2xml tool will not work with this ASN.1 format either; I just
checked it to be sure. They do seem to be mulling the option of XML,
though, on the Gene FAQ. Maybe if enough people get in their ears they
will spend some effort towards that. After all, the Entrez Gene web
interface can display XML on demand, even though it looks fairly
hideous.

There is no ASN.1 support in BioPerl at all. ASN.1 support in Perl in
general is also thin: the only module I could find is Convert::ASN1, at
version 0.18 as of two years ago, which doesn't make me feel warm and fuzzy.

In the absence of any XML available from NCBI, gene_info might be the
best start. An option could be to check for the presence of the other
tab-delimited files and use those that are present. Since the format
itself is trivial, you can focus entirely on setting up a Bio::Seq plus
annotation that's comparable/compatible to what the current
SeqIO::locuslink does.
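
To illustrate how little the format itself demands, here is a minimal
sketch of the tab-delimited reading step. It is written in Python only to
keep it short; it is not BioPerl code and not an existing parser. The
column order, the use of "-" for missing values, and the "|" separator
inside Synonyms and dbXrefs are all assumptions that should be checked
against the README NCBI ships with gene_info. The function name
parse_gene_info is made up for this example.

```python
# Assumed order of the leading gene_info columns; verify against NCBI's README.
ASSUMED_COLUMNS = [
    "tax_id", "GeneID", "Symbol", "LocusTag", "Synonyms",
    "dbXrefs", "chromosome", "map_location", "description",
]


def parse_gene_info(lines, columns=ASSUMED_COLUMNS):
    """Yield one dict per record from an iterable of tab-delimited lines.

    `lines` can be an open file handle or any iterable of strings.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank lines and header/comment lines starting with "#".
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Assumption: a lone "-" marks a missing value.
        record = {
            name: (None if value == "-" else value)
            for name, value in zip(columns, fields)
        }
        # Assumption: these two columns hold "|"-separated lists.
        for key in ("Synonyms", "dbXrefs"):
            value = record.get(key)
            record[key] = value.split("|") if value else []
        yield record
```

Each resulting record would then still need to be mapped onto a sequence
object plus annotation, which is where the real work lies.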

My $0.02 (worth less and less almost every day).

	-hilmar

On Thursday, December 23, 2004, at 10:51  AM, Peter Robinson wrote:

> Hi,
>
> I have been thinking about giving a BioPerl EntrezGene parser a try since
> I have been a heavy user of LocusLink to date. One issue is that the
> files that correspond to LL_tmpl (which was a flat file) are now in ASN.1
> format:
> http://www.ncbi.nlm.nih.gov/entrez/query/static/help/genehelp.html#query
> Although I saw some mention of ASN support in BioPerl by googling, I
> can't seem to find any module that does this in the present
> distribution. What is the status on that? In any case, I will be working
> on this in the next month or two and if anything nice comes of it I will
> send it to you / BioPerl.
>
> best wishes & happy holidays
>
> Peter
>
> On Tue, 2004-12-14 at 09:00, Hilmar Lapp wrote:
>> Since load_seqdatabase.pl will use bioperl's SeqIO parsers for parsing
>> any input file, what you're asking is whether or not there is a SeqIO
>> parser for NCBI Gene.
>>
>> The answer to that question is no, not yet. Anybody who feels motivated
>> is welcome to give it a try ... Since I'll need it, I'll write the
>> parser if nobody else does within the next 3 months, but I'm not going
>> to promise when exactly this will happen.
>>
>> 	-hilmar
>>
>> On Monday, December 13, 2004, at 08:03  AM, Law, Annie wrote:
>>
>>> Hi,
>>>
>>> I was wondering, with regard to the bioperl-db scripts, schema, and
>>> load_seqdatabase.pl: has there been preparation for integration of
>>> Entrez Gene information when LocusLink is phased out? Or if it has
>>> already been changed, could somebody point me to the documentation or
>>> changed code?
>>>
>>> Thanks,
>>> Annie.
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at portal.open-bio.org
>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
> -- 
> Peter N. Robinson
> peter.robinson at t-online.de
> peter.robinson at charite.de
> http://www.charite.de/ch/medgen/robinson/
>
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------



