[Bioperl-l] about gene "boundaries"
shafeeq rim
hsa_rim at yahoo.co.in
Thu Apr 29 06:57:13 UTC 2010
Hi Dimitar,
Attached is a C++ program to do your job. It is much faster than Perl and can do the job in less than a second, even for the full chromosome 1.
Steps:
1- Download the FASTA file and remove the header line (the line starting with ">").
2- Run the RemoveNewline.pl program on it like this:
RemoveNewline.pl inputfile > outputfile
3- Compile the C++ program using this command:
g++ ExtractSequence.cpp -o ExtractSequence
4- Run the C++ program on the newline-free file from step 2. On Linux:
./ExtractSequence inputfilename start stop
On Windows:
ExtractSequence inputfilename start stop
e.g. ExtractSequence chr1.fasta 10000 20000
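The attached ExtractSequence.cpp and RemoveNewline.pl are not preserved in the archive, so what follows is only a minimal C++ sketch of why the steps above can be fast, not the original code. Once the header and line breaks are gone, each base occupies exactly one byte, so a position maps directly to a file offset and the program can seek to it instead of reading the whole chromosome. The function names flatten_fasta and extract_sequence are hypothetical, the sketch assumes a single-sequence FASTA file, and it assumes start and stop are 1-based and inclusive (the original program's convention is not stated).

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for steps 1-2: copy a single-sequence FASTA file
// to out_path, dropping header lines (starting with '>') and line breaks.
void flatten_fasta(const std::string& in_path, const std::string& out_path) {
    std::ifstream in(in_path);
    std::ofstream out(out_path, std::ios::binary);
    if (!in || !out) {
        throw std::runtime_error("cannot open input or output file");
    }
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();  // tolerate Windows line endings
        }
        if (!line.empty() && line[0] == '>') {
            continue;  // skip the header line
        }
        out << line;  // write sequence characters without a newline
    }
}

// Hypothetical stand-in for step 4. ASSUMPTION: start and stop are
// 1-based, inclusive positions in a file produced by flatten_fasta.
std::string extract_sequence(const std::string& path,
                             std::size_t start, std::size_t stop) {
    if (start < 1 || stop < start) {
        throw std::invalid_argument("need 1 <= start <= stop");
    }
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    // One byte per base: position N sits at byte offset N - 1, so we can
    // jump straight to it rather than scanning from the beginning.
    in.seekg(static_cast<std::streamoff>(start - 1));
    std::string out(stop - start + 1, '\0');
    in.read(&out[0], static_cast<std::streamsize>(out.size()));
    // Keep only the bytes actually read if the range ran past end of file.
    out.resize(static_cast<std::size_t>(in.gcount()));
    assert(out.size() <= stop - start + 1);
    return out;
}
```

Under these assumptions, calling flatten_fasta("chr1.fasta", "chr1.flat") and then extract_sequence("chr1.flat", 10000, 20000) would return the 10001 bases in that range, provided the file is at least that long.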
Hope this helps.
Thanks
Ashfaq
________________________________
From: Chris Fields <cjfields at illinois.edu>
To: Dimitar Kenanov <dimitark at bii.a-star.edu.sg>
Cc: bioperl-l at bioperl.org
Sent: Wed, 28 April, 2010 11:10:40 PM
Subject: Re: [Bioperl-l] about gene "boundaries"
By local DB, do you mean a BioPerl-based local DB? Or is it something else? This is a bit vague.
On the BioPerl side I suggest looking into Bio::DB::SeqFeature::Store for storing and querying genome information (it does exactly what you want if the proper information is loaded), or maybe the Ensembl Perl API, which can be used with a local or remote Ensembl setup. Beyond that you'll need to be more specific.
chris
On Apr 28, 2010, at 8:17 AM, Dimitar Kenanov wrote:
> Hello guys,
> I have a question about gene "boundaries". Is there a module in BioPerl which can help me extract the DNA sequence from a genomic DB (from a specific chromosome)? I have my human genome in a local DB and some "from-to" data sets corresponding to different chromosomes, so I want to get the DNA seqs for these from-to's. I know I can do that the normal way, but if there is a way to do it with BioPerl it will be more consistent with the rest of the code.
>
> Thanks for any tips :)
>
> Cheers
> Dimitar
>
> --
> Dimitar Kenanov
> Postdoctoral research fellow
> Protein Sequence Analysis Group
> Bioinformatics Institute
> A*STAR, Singapore
> email: dimitark at bii.a-star.edu.sg
> tel: +65 6478 8514
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ExtractSequence.cpp
Type: application/octet-stream
Size: 777 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20100429/1edf4c04/attachment-0008.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: RemoveNewline.pl
Type: application/octet-stream
Size: 137 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20100429/1edf4c04/attachment-0009.obj>