[Bioperl-l] Getting sequences by base pair locations

Kevin Brown Kevin.M.Brown at asu.edu
Tue Aug 1 22:43:00 UTC 2006


WWW::Mechanize for Perl is a great way to submit web forms repeatedly.  I
use it for things like MHC epitope prediction sites, and to grab things
like journal articles matching certain keywords.

http://www.perl.com/pub/a/2003/01/22/mechanize.html
http://search.cpan.org/dist/WWW-Mechanize/lib/WWW/Mechanize.pm 
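
For anyone who wants the general shape of "submit the same form once per
input" without Perl, here is a minimal sketch using Python's standard
library instead of WWW::Mechanize.  The URL and the field name are
placeholders I made up for illustration; a real site will have its own,
and Mechanize does considerably more (cookies, following links, parsing
the form for you).

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint, for illustration only; not a real service.
FORM_URL = "http://example.org/cgi-bin/predict"


def build_request(fields, url=FORM_URL):
    """Return a POST request carrying the given form fields."""
    body = urlencode(fields).encode("ascii")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def submit_all(inputs, field_name="sequence", opener=urlopen):
    """Submit the form once per input value and collect the response bodies.

    field_name is an assumed form field; opener can be swapped out for testing.
    """
    results = []
    for value in inputs:
        with opener(build_request({field_name: value})) as response:
            results.append(response.read().decode("utf-8", errors="replace"))
    return results
```

Please be polite to the server: add a pause between submissions if you
are sending more than a handful.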

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of 
> Cook, Malcolm
> Sent: Tuesday, August 01, 2006 8:12 AM
> To: Yuval Itan; bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Getting sequences by base pair locations
> 
> Yuval,
> 
> Glad to help.  Given that you are not running blat suite locally,
> but at ucsc, you should try this approach:
> 
> upload/paste your blat results (in blat's native output format, psl)
> as a custom track in the genome browser, named, say, myhumanhits
> (i.e. just give the blat results a new first line like: `track
> name="myhumanhits" description="myhumanhits from my favorite human
> genes" visibility=2`)
> then go to the table browser and configure it:
> 	group = 'custom tracks'
> 	track = 'myhumanhits'
> 	region = genome
> 	output format = sequence
> 	output file = myhumanhits.fasta
> 
> submit it
> 
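
The "new first line" step above can be sketched as follows, in Python.
The track attributes are the ones Malcolm gives; the function name is
just something I picked for illustration.

```python
def add_track_line(psl_text, name, description, visibility=2):
    """Return the psl text with a custom-track definition line prepended."""
    track = 'track name="{0}" description="{1}" visibility={2}'.format(
        name, description, visibility
    )
    return track + "\n" + psl_text
```

Paste or upload the returned text as the custom track.
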
> When prompted, save myhumanhits.fasta to your computer and take it
> from there.
> 
> I'm not sure how many hits this will work for, but I just did this on
> a small track and it works just fine.  Only problem: the first word in
> the fasta defline is always the same for all sequences.  You'll
> probably have to 'uniqify' these names somehow (depending, of course,
> on your application).
> 
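
One simple way to 'uniqify' the names is to append a running number to
the first word of each defline.  A minimal Python sketch, assuming plain
FASTA input; whether a running number is enough depends on your
application, and the names in the example below are made up.

```python
def uniquify_deflines(lines):
    """Yield FASTA lines, appending _1, _2, ... to each defline's first word."""
    count = 0
    for line in lines:
        if line.startswith(">"):
            count += 1
            # Split the defline into its first word and the remaining text.
            parts = line[1:].rstrip("\n").split(None, 1)
            name = parts[0] if parts else "seq"
            rest = " " + parts[1] if len(parts) > 1 else ""
            yield ">{0}_{1}{2}\n".format(name, count, rest)
        else:
            yield line
```

For example, two records that both start with `>hit` come out as
`>hit_1` and `>hit_2`, with the rest of each defline left alone.
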
> Let us know how it goes, and good luck.  For good email support on the
> ucsc genome browser, subscribe to
> http://www.soe.ucsc.edu/mailman/listinfo/genome-announce
> 
> Malcolm Cook
> Database Applications Manager, Bioinformatics
> Stowers Institute for Medical Research 
>  
> 
> >-----Original Message-----
> >From: bioperl-l-bounces at lists.open-bio.org 
> >[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Yuval Itan
> >Sent: Tuesday, August 01, 2006 8:36 AM
> >To: bioperl-l at lists.open-bio.org
> >Subject: Re: [Bioperl-l] Getting sequences by base pair locations
> >
> >Thank you all for all the helpful answers!
> >Malcolm- I've used the UCSC server to do the BLAT search (because I 
> >couldn't run it locally due to memory problems)- so I could not get the 
> >chimp sequences in a convenient way. I have the results also in a 
> >normal Blat output including all usual fields: chromosome number etc.
> >Wade- thanks a lot for your offer, that would be great. The chimp 
> >genome is just one large fasta format file.
> >Cheers,
> >Yuval
> >On 28 Jul 2006, at 14:30, Sean Davis wrote:
> >
> >> Yuval Itan wrote:
> >>> Hello all,
> >>> I was BLATing a few hundred human genes against the chimp genome, and 
> >>> kept the best chimp hits for every human gene.
> >>> I have the base pair start and end location for every chimp hit, and 
> >>> I need to get the sequence for each of these chimp hits. Here is an 
> >>> example for a few chimp hits bp locations:
> >>> Start End
> >>> 142854 144504
> >>> 154479 155198
> >>> 153066 167370
> >>> 163146 163559
> >>> I have one chimp genome file (about 3GB) including all chromosomes, 
> >>> but I could also get one file per chromosome if that would make 
> >>> things easier. Does anyone have a script or a link for an interface 
> >>> that can do the job?
> >
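
If you would rather do this locally, here is a minimal Python sketch
that pulls regions out of a multi-record FASTA file in one pass.  It
keeps only one wanted chromosome in memory at a time.  Assumptions you
should check against your data: coordinates are 1-based and inclusive
(psl files use zero-based starts, in which case drop the `- 1`), the
chromosome name is the first word of each defline, and all hits are
wanted on the forward strand.

```python
def extract_regions(fasta_lines, regions):
    """Extract subsequences from FASTA lines.

    regions maps a sequence name to a list of (start, end) pairs,
    assumed 1-based and inclusive.  Returns a dict keyed by
    (name, start, end).
    """
    found = {}

    def flush(name, chunks):
        # Slice every requested region out of the record just finished.
        if name in regions:
            seq = "".join(chunks)
            for start, end in regions[name]:
                found[(name, start, end)] = seq[start - 1:end]

    name, chunks = None, []
    for line in fasta_lines:
        if line.startswith(">"):
            flush(name, chunks)
            header = line[1:].split()
            name = header[0] if header else None
            chunks = []
        elif name in regions:
            # Only buffer sequence for records we actually need.
            chunks.append(line.strip())
    flush(name, chunks)
    return found
```

Pass it an open file handle for the genome file and a dict built from
your hit table, then write the returned sequences out however you like.
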
> >_______________________________________________
> >Bioperl-l mailing list
> >Bioperl-l at lists.open-bio.org
> >http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> 
