[Bioperl-l] How can I pull out all instances of a motif from a genome sequence and output them as a BED file?
Chris Fields
cjfields at uiuc.edu
Thu Jun 14 01:58:37 UTC 2007
This is answered in the FAQ (sorry if the URL wraps, but we don't
like tinyurls):
http://www.bioperl.org/wiki/FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_.22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F
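As for the list of 1's from the loop quoted below: that looks like operator precedence rather than a problem with the match itself. In Perl, "." and "+" have the same precedence and associate left to right, so the message string is concatenated with pos() first and 1 is then added to the result; the string numifies to 0, hence the 1. Parenthesizing the arithmetic, e.g. (pos($sequence_as_a_string) + 1), should fix it.

Here is a minimal sketch of collecting match coordinates and printing BED lines. It assumes plain, non-overlapping regex matches. The motif_to_bed helper, the 'chr3L' chromosome name, and the toy sequence are made up for illustration; in your script the string would come from $seq_object->seq(). BED starts are 0-based and ends are exclusive, which is what @- and @+ give you directly.

```perl
use strict;
use warnings;

# Hypothetical helper: return one BED line (chrom, start, end, name) for
# every non-overlapping match of $motif in $sequence.
sub motif_to_bed {
    my ($chrom, $sequence, $motif) = @_;
    my @bed;
    while ($sequence =~ /$motif/g) {
        my $start = $-[0];    # 0-based offset where the match starts
        my $end   = $+[0];    # offset just past the match (same as pos())
        push @bed, join("\t", $chrom, $start, $end, $motif);
    }
    return @bed;
}

# Toy input; 'chr3L' is a placeholder and must match the browser's naming.
print "$_\n" for motif_to_bed('chr3L', 'CCAAAGGAAAATT', 'AAA');
```

On the toy sequence this prints two tab-separated lines, chr3L 2 5 AAA and chr3L 7 10 AAA. The overlapping hit at 8-11 is skipped because /g resumes after the previous match. A lookahead such as (?=AAA) would report overlapping hits as well, but then the end has to be computed from the motif length.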
chris
On Jun 13, 2007, at 7:20 PM, John Cumbers wrote:
> Hello,
>
> I have a simple problem: I'm trying to search a genome sequence for a
> motif, and I then want to output a BED file to display all the locations
> of this motif on the UCSC Genome Browser. I could not find a script to do
> this, so I started to write my own. I'm new to Perl, and my code below was
> my attempt to read the sequence string and output the index (in bp) of the
> start of each motif. With this I could build the BED file myself, which
> requires the start and end base pairs.
>
> For the first motif I can output the start index, but when I try to read
> the next one off the sequence it does not work. Instead I just get a list
> of 1's as output. I realise that this is more a request for some simple
> Perl help, but any help is much appreciated.
>
> Best wishes,
> John
>
>
> $seq_object = read_sequence("Drosophila.Chr3.test.AE014296.fasta"); # turn my FASTA file into a seq object
> $sequence_as_a_string = $seq_object->seq(); # turn it into a string
> # search $sequence_as_a_string string for motif AAA as example
> # if found, return the index that it is found at
>
> while ($sequence_as_a_string =~ m/AAA/g) {
>     print "Found '$&'. Next attempt at character " . pos($sequence_as_a_string)+1 . "\n";
> }
>
>
>
> --
> John Cumbers, Graduate Student
> Biology and Medicine
> Brown University, Box G-W
> Providence, Rhode Island, 02912, USA
> Tel USA: +1 401 523 8190, Fax: +1 401 863-2166
> UK to USA: 0207 617 7824
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign