[Bioperl-l] Re: Bio::DB::WebSeqDB and Bio::DB::GenBank
Jason Stajich
jason@chg.mc.duke.edu
Tue, 12 Dec 2000 11:50:36 -0500 (EST)
On Mon, 11 Dec 2000, Francis Ouellette wrote:
> It should be noted that ASN.1 is a much richer format than GB/EMBL,
> and it can hold many types of annotations not present in the GB/EMBL
> format ... for example, things like 1) alignments or 2) quality of
> base call (from Ace/phred output).
>
> Here we store all of our data in binary asn.1 in house
> (saves space) and can then write out anything to whatever format ...
> (typically GB or FASTA, but we can invent our own formats as well, like
> the one we are working on for SNPs, which also come into our system in
> ASN.1)
>
> There is obviously a cost to doing this (having to work with the
> NCBI toolkit is the major one), but you gain from inheriting 12 years
> of code developed by pretty good programmers (like using bioperl, I
> guess :-)
>
> There are converters out there (asn<->xml) and one need not delve
> into the asn.1 world if you don't want to, but understanding it and
> working with it will give you access to a richer data format and
> richer data model ...
>
Francis - thanks for the insight.
I think we should try in earnest to add functionality for reading/writing
NCBI XML and/or ASN.1 in bioperl. There are some obvious advantages: we
would be able to provide a useful platform for people with ASN.1
databases, as well as cleaner data retrieval from GenBank, etc. But I
think it will have to be post 0.7, since it represents a fair amount of
work. I volunteer to investigate feasibility once we have 0.7 out the
door.
>
> f.
>
> --
> | B.F. Francis Ouellette Tel: (604) 875-3815 |
> | Director, Bioinformatics Core Facility Fax: (425) 740-6978 |
> | CMMT, UBC, Canada http://www.cmmt.ubc.ca |
> | francis@cmmt.ubc.ca http://www.bioinformatics.ca |
Jason Stajich
jason@chg.mc.duke.edu
Center for Human Genetics
Duke University Medical Center
http://www.chg.duke.edu/