[Bioperl-l] need BLAT parse code
Sean Davis
sdavis2 at mail.nih.gov
Tue Nov 29 12:18:30 EST 2005
Neeti,
You could simply put the file in a text editor and take out the header.
Alternatively, type:
blat
without any arguments. You will notice that there are many options to blat,
one of which is -noHead, which suppresses the header. Or, look at only
lines that begin with a number using a regular expression.
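
The regular-expression approach might look something like the sketch below. It is only an illustration, not tested against real BLAT output: the helper name target_coords is made up for this example, and it assumes the standard 21-column PSL layout from the quoted code, where tStart and tEnd are the 16th and 17th fields.

```perl
use strict;
use warnings;

# Hypothetical helper: return [tStart, tEnd] for a PSL data line,
# or undef for header, separator, and blank lines.
sub target_coords {
    my ($line) = @_;
    return unless $line =~ /^\d/;    # data rows begin with the match count
    my @fields = split ' ', $line;   # split on any run of whitespace
    return unless @fields >= 17;     # need everything up to tEnd
    return [ @fields[ 15, 16 ] ];    # 0-based: 15 = tStart, 16 = tEnd
}

# Only read input when a file name is given on the command line.
if (@ARGV) {
    while ( my $line = <> ) {
        my $coords = target_coords($line) or next;
        print "$coords->[0]\t$coords->[1]\n";
    }
}
```

Saved under a name of your choosing and run with output.psl as its argument, this should print one tab-separated start/end pair per alignment and silently skip the header lines.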
Ultimately, I think that it will serve you well to read a Perl book, though,
as parsing a text file is an important and basic topic to grasp if you want
to use Perl for data analysis.
Sean
On 11/29/05 9:43 AM, "Jason Stajich" <jason.stajich at duke.edu> wrote:
>
>
> Begin forwarded message:
>
>> From: neeti somaiya <neetisomaiya at gmail.com>
>> Date: November 29, 2005 1:27:27 AM EST
>> To: Jason Stajich <jason.stajich at duke.edu>
>> Subject: Re: [Bioperl-l] need BLAT parse code
>>
>> I use the following code :
>>
>> open(FH,"output.psl");
>> while(<FH>)
>> {
>> if( /^psLayout/ )
>> {
>> for( 1..4 ) { <> }
>> }
>> my @line = split;
>> my ( $matches,$mismatches,$rep_matches,$n_count,
>> $q_num_insert,$q_base_insert,
>> $t_num_insert, $t_base_insert,
>> $strand, $q_name, $q_length, $q_start,
>> $q_end, $t_name, $t_length,$t_start, $t_end, $block_count,
>> $block_sizes, $q_starts, $t_starts
>> ) = split;
>>
>>
>> print $t_start;
>> print "\n";
>> print $t_end;
>>
>> }
>>
>> for output.psl file :
>>
>> match mis- rep. N's Q gap Q gap T gap T gap
>> strand Q Q Q Q T
>> T T T block blockSizes qStarts tStarts
>> match match count bases count
>> bases name size start end
>> name size start end count
>> ----------------------------------------------------------------------
>> ----------------------------------------------------------------------
>> -------------------
>> 27025 0 0 0 0 0 0 0
>> + query_sequence3 27025 0 27025
>> database_sequence3 57701691 132995 160020 1
>> 27025, 0, 132995,
>> ~
>>
>>
>> It gave me output :
>>
>> Q
>> Q
>>
>> 132995
>> 160020
>>
>> What is the Q? Can't I obtain the coordinates (132995, 160020) alone?
>>
>> Please let me know.
>> Thanks.
>>
>> On 11/28/05, Jason Stajich <jason.stajich at duke.edu> wrote:
>> Bio::SearchIO::psl can parse psl output.
>>
>> or more simply:
>>
>> while(<>) {
>>     if( /^psLayout/ ) {      # if there is a header
>>         for( 1..4 ) { <> }   # take next 4 lines to skip the header
>>         next;                # $_ still holds the psLayout line, so skip it too
>>     }
>>     my ( $matches,$mismatches,$rep_matches,$n_count,
>>          $q_num_insert,$q_base_insert,
>>          $t_num_insert, $t_base_insert,
>>          $strand, $q_name, $q_length, $q_start,
>>          $q_end, $t_name, $t_length,$t_start, $t_end,
>>          $block_count,
>>          $block_sizes, $q_starts, $t_starts
>>        ) = split;
>>
>>     # query aln vals are $q_start, and $q_end values
>>     # hit aln vals are $t_start, $t_end
>> }
>>
>> On Nov 28, 2005, at 8:06 AM, neeti somaiya wrote:
>>
>>> Hi,
>>>
>>> I am using BLAT in a project. I have simple .psl output files from
>>> running BLAT with gene sequences against full chromosomal
>>> sequences. Does anyone have simple BLAT parsing code? I am only
>>> interested in obtaining the
>>> alignment start and end positions on the target.
>>> --
>>> -Neeti
>>> Even my blood says, B positive
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at portal.open-bio.org
>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
>> --
>> Jason Stajich
>> Duke University
>> http://www.duke.edu/~jes12