[Bioperl-l] arabidopsis + load_seqdatabase.pl

Mon Dec 19 14:10:46 EST 2005

Hi Sean,

What I need is precisely the latest Arabidopsis files (peptide as well
as DNA) that have loaded into the database successfully *when used with
the load_seqdatabase.pl* script.
I've tried some other files, but they don't load all the tables correctly
(e.g. the accession number, name, and identifier are not distinguished, and
the same data is loaded into all three columns).

Please let me know if you have any queries.

Thanks,
Angshu

On 12/19/05, Sean Davis <sdavis2 at mail.nih.gov> wrote:
>
> On 12/19/05 12:31 PM, "Angshu Kar" <angshu96 at gmail.com> wrote:
>
> > Hi,
> >
> > I'm not fully sure whether to post this question in this community, but I
> > feel those who are working in plant genomics using BioPerl can possibly
> > answer this. I'm trying to use load_seqdatabase.pl to load data into the
> > BioSQL schema. Can anyone please suggest an Arabidopsis data file source
> > that has all the additional information (probably GenBank format) but only
> > holds the CDSs?
> > I'd be obliged if any of you who has used such a file could help me with
> > the answer.
>
> Angshu,
>
> What information do you need from these files, specifically?  And what is
> your definition of a gene?  If you want to stick to RefSeq genes, you can
> download from here:
>
> ftp://ftp.ncbi.nih.gov/refseq/release/plant
>
> But the question is really, what EXACT information do you need and what is
> the question that you want to answer?  It is only by deciding what you need
> that you will know what files will suit (or not suit) your needs (and this
> may be a question that you have to decide for yourself).
>
> If you are going to be using NCBI resources (like the link above), I highly
> suggest looking at the NCBI handbook here before proceeding too far:
>
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?call=bv.View..ShowTOC&rid=handbook.TOC&depth=2
>
> Sean