[Bioperl-l] retrieval of sequences from remote DB... Bio::DB::RefSeq?

Mikaela Ilinca Gabrielli MILG@lundbeck.com
Thu, 7 Nov 2002 15:16:50 +0100


Hello Bioperl-people,

I have a problem, and perhaps an idea for an improvement to a BioPerl module.


I'd like to be able to retrieve not only a protein sequence or a DNA
sequence from RefSeq but both at the same time!
For example, if I have the protein accession number, I'd like to retrieve
the corresponding DNA sequence (if it exists). I haven't found any module
that does this; if there is one, please enlighten me!

But if there isn't, shouldn't it be rather easy to create one? In the RefSeq
DB (available through LocusLink from NCBI) each entry is linked to both the
protein (amino acid) sequence and the nucleotide (cDNA or mRNA) sequence
already! Wouldn't it be neat to create an add-on to Bio::DB::RefSeq
for retrieval of both upon request?

I would really appreciate ANY ideas for resolving this issue. The only way I
can think of right now would be to retrieve one sequence (protein or DNA),
BLAST for its corresponding DNA/protein sequence and parse the results to
get the one I'm seeking. But it could get messy, and I'd like to know whether
this is the only way.

Thanks in advance for tips and ideas!

/Mikaela