[Bioperl-l] refseqs

Ewan Birney birney at ebi.ac.uk
Tue Apr 27 17:42:53 EDT 2004



On Tue, 27 Apr 2004, Phillips, Ken wrote:

>
> All -
>
> I have a list of refseq mRNA accessions for which I want to parse out
> individual exon sequence for array probe design.  I am not very familiar
> with the bioperl api, and it seems at first glance a rather daunting
> task as the exon coordinates are not present in the refseq flatfile, and
> not accessible as seqFeature objects using the exon tag.  My question
> is, how does one map refseq accessions to coordinates on a genomic
> contig?

Ken -

Have you thought about using the Ensembl API for the exon set or, perhaps
easier still, the EnsMart dumping system, which lets you dump all the
exons as a FASTA file from a set of RefSeq IDs?

Go to www.ensembl.org, click on "Data Mining/EnsMart", and accept the
default dataset of Human and Ensembl Genes. Uncheck the genome location
filter (this is on by default so that people who just accept the defaults
don't always end up with the whole genome) and switch on the filter that
restricts to genes with RefSeq IDs. Upload your file of RefSeq IDs, and
then move to the output page. Select the Sequence tab and select "exon
sequences only".
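
Once the exon FASTA file has been downloaded, it can be read back in for
probe design. What follows is only a minimal Python sketch and is not part
of EnsMart or BioPerl. It assumes the export is ordinary FASTA (header
lines starting with ">", sequence possibly wrapped over several lines) and
makes no assumption about what the header contains. The names read_fasta
and exons_long_enough, and the minimum-length filter itself, are
hypothetical and chosen purely for illustration.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines.

    The header is returned without the leading '>' and is not parsed
    further, since the exact header layout of the export is not assumed.
    """
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            if header is not None:
                # A new record starts, so emit the one collected so far.
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence line, possibly one of several
    if header is not None:
        yield header, "".join(chunks)


def exons_long_enough(
    records: Iterable[Tuple[str, str]], min_length: int
) -> List[Tuple[str, str]]:
    """Keep only records whose sequence has at least min_length bases.

    Hypothetical filter: the threshold depends on the probe design.
    """
    return [(h, s) for h, s in records if len(s) >= min_length]
```

With a downloaded file (the filename here is made up), usage would look
like: open "exons.fa", pass the file handle to read_fasta, and hand the
resulting records to exons_long_enough with whatever minimum exon length
the array design calls for.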


In passing, you will notice all the other options you can filter on or
include in the output.


If we can help in any other way, just ask on the "helpdesk" (available
from all Ensembl pages) or get back to me.


>
> Any help would be greatly appreciated.
> Thanks,
> KP
>
>
>
>
>
> Kenneth L. Phillips
> Bioinformatics Specialist
> Computational Systems Biology
> ParadigmGenetics, Inc.
> 108 T.W. Alexander Drive
> P.O. Box 14528
> Research Triangle Park, NC 27709-4528
>
> Phone: (919) 425-3000
> Direct: (919) 425-3075
> Cell: (919) 632-9865
> kphillps at paragen.com
> www.paragen.com
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>


