[Bioperl-l] Add a kind of hspsepQmax/hspsepSmax (like WuBlast has) in Bio::Search::Tiling::MapTiling
Mark A. Jensen
maj at fortinbras.us
Mon Apr 26 13:17:51 UTC 2010
Hi Fred,
I'll tell you how you can write a kludge; maybe you can expand it into
a more general method.
For your tblastn data, get the coverage map array:
my @map = $tiling->coverage_map('hit', 'p0');
Each element of the map is a ref to a pair [$int, $hsp], where $int is
itself a reference to a two-element array containing the coordinates of
the HSP in that context and $hsp is the HSP object itself. You can use
these to filter the @map array.
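
To make that structure concrete, here is a minimal sketch in plain Perl
that filters map elements on their interval coordinates. It does not load
BioPerl: the map is mocked with the [$int, $hsp] shape described above,
the strings stand in for real HSP objects, the coordinates are invented,
and elements_from() is a hypothetical helper name, not part of MapTiling.

```perl
use strict;
use warnings;

# Mock coverage map: each element is [ [$start, $end], $hsp ].
# Strings replace real HSP objects; coordinates are made up.
my @map = (
    [ [ 10,  50 ],  'hsp_a' ],
    [ [ 200, 260 ], 'hsp_b' ],
    [ [ 300, 340 ], 'hsp_c' ],
);

# Hypothetical helper: keep only the elements whose interval
# starts at or after $min_start.
sub elements_from {
    my ($min_start, @elements) = @_;
    return grep { $_->[0][0] >= $min_start } @elements;
}

my @kept = elements_from(100, @map);

# Print the surviving intervals as start..end.
printf "%d..%d\n", @{ $_->[0] } for @kept;
```

Any other test on $_->[0] (the interval) or $_->[1] (the HSP) can go in
the grep block in the same way.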
For your example, you can just get rid of the first @map element:
shift @map;
Replace the internal map for this type and context, so that
the methods work on the modified map:
$tiling->{'coverage_map_hit_p0'} = \@map;
Then $tiling->identities('hit', 'exact', 'p0'), etc. give you the
new values.
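
One possible way to generalize the shift into something like
hspsepSmax is sketched below, again in plain Perl on a mocked map. The
intervals are the hit coordinates from your TBLASTN output; whether the
real coverage map uses exactly those coordinates is an assumption worth
checking. cluster_by_separation() and covered_length() are hypothetical
helpers, and the 2000 threshold is arbitrary.

```perl
use strict;
use warnings;

# Split map elements into clusters. A new cluster starts whenever the
# gap between an interval and the furthest end seen so far exceeds
# $max_sep.
sub cluster_by_separation {
    my ($max_sep, @elements) = @_;
    my @sorted = sort { $a->[0][0] <=> $b->[0][0] } @elements;
    my (@clusters, $prev_end);
    for my $elt (@sorted) {
        my ($start, $end) = @{ $elt->[0] };
        if (!defined $prev_end || $start - $prev_end > $max_sep) {
            push @clusters, [];
        }
        push @{ $clusters[-1] }, $elt;
        $prev_end = $end if !defined $prev_end || $end > $prev_end;
    }
    return @clusters;
}

# Total length covered by a cluster (closed, disjoint intervals assumed).
sub covered_length {
    my ($cluster) = @_;
    my $len = 0;
    $len += $_->[0][1] - $_->[0][0] + 1 for @$cluster;
    return $len;
}

# Mock map; strings stand in for the HSP objects.
my @map = (
    [ [ 7666,  7917 ],  'hsp_far' ],
    [ [ 23971, 24186 ], 'hsp_1' ],
    [ [ 24820, 24915 ], 'hsp_2' ],
    [ [ 25195, 25308 ], 'hsp_3' ],
    [ [ 25390, 25620 ], 'hsp_4' ],
);

my @clusters = cluster_by_separation(2000, @map);

# Keep the cluster that covers the most sequence.
my ($best) = sort { covered_length($b) <=> covered_length($a) } @clusters;
my @filtered = @$best;
print scalar(@clusters), " clusters; keeping ", scalar(@filtered), " elements\n";

# With a real tiling object, the replacement step would then be
# (not run here):
# $tiling->{'coverage_map_hit_p0'} = \@filtered;
```

Keeping the cluster with the largest covered length is only one choice;
you could just as well emit one match_set per cluster.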
HTH-
MAJ
----- Original Message -----
From: <Frederic.SAPET at biogemma.com>
To: <bioperl-l at bioperl.org>
Sent: Friday, April 23, 2010 11:16 AM
Subject: [Bioperl-l] Add a kind of hspsepQmax/hspsepSmax (like WuBlast has)in
Bio::Search::Tiling::MapTiling
> Hello
>
> Based on bp_search2gff.pl script and Bio::Search::Tiling::MapTiling
> documentation (http://www.bioperl.org/wiki/HOWTO:Tiling), I'm trying to
> write a generic blast to gff3 parser.
>
> My idea is to filter hits on frac_aligned and percent_identity values.
>
> I'm facing a problem with a BlastX result and the corresponding TBlastN.
>
> Please find my script and the two example files attached.
>
> The example is a piece of Maize Chromosome where a protein seems to be
> duplicated.
>
> When I parse the BlastX file and retrieve the data from a query view
> (>tiling.pl BlastX query), I get:
>
> Chr6:159690000-159718000 BLASTX match_set 23971 25620 121.6 + . ID=Os03g17980.2:1.1.1;alignLength=576;eValue=4.6e-137;fractionAligned=97.0530451866405;gapNumber=16;Name=Os03g17980.2;percentageIdentity=69.1552062868369
> Chr6:159690000-159718000 BLASTX match_part 23971 24186 331 + 0 Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 120 191
> Chr6:159690000-159718000 BLASTX match_part 24820 24915 100 + 0 Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 291 322
> Chr6:159690000-159718000 BLASTX match_part 25195 25308 89 + 0 Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 358 395
> Chr6:159690000-159718000 BLASTX match_part 25390 25620 192 + 0 Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 395 472
>
> Chr6:159690000-159718000 BLASTX match_set 918 2567 121.6 + . ID=Os03g17980.2:1.2.1;alignLength=576;eValue=4.6e-137;fractionAligned=97.0530451866405;gapNumber=16;Name=Os03g17980.2;percentageIdentity=69.1552062868369
> Chr6:159690000-159718000 BLASTX match_part 918 1148 192 - 0 Parent=Os03g17980.2:1.2.1;Target=Os03g17980.2 395 472
> Chr6:159690000-159718000 BLASTX match_part 1230 1343 89 - 0 Parent=Os03g17980.2:1.2.1;Target=Os03g17980.2 358 395
> Chr6:159690000-159718000 BLASTX match_part 1623 1718 100 - 0 Parent=Os03g17980.2:1.2.1;Target=Os03g17980.2 291 322
> Chr6:159690000-159718000 BLASTX match_part 2352 2567 331 - 0 Parent=Os03g17980.2:1.2.1;Target=Os03g17980.2 120 191
>
> This is perfect: I retrieve two nice hits, with perfectly tiled HSPs.
>
> But with the TBlastN report (using a hit view: >tiling.pl TBlastN hit),
> I get:
> Chr6:159690000-159718000 TBLASTN match_set 7666 25620 121.6 + . ID=Os03g17980.2:1.1.1;alignLength=303;eValue=4.9e-137;fractionAligned=98.8212180746562;gapNumber=18;Name=Os03g17980.2;percentageIdentity=66.0052390307793
> Chr6:159690000-159718000 TBLASTN match_part 7666 7917 44 + 0 Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 332 416
> Chr6:159690000-159718000 TBLASTN match_part 23971 24186 331 + 0 Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 120 191
> Chr6:159690000-159718000 TBLASTN match_part 24820 24915 100 + 0 Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 291 322
> Chr6:159690000-159718000 TBLASTN match_part 25195 25308 89 + 0 Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 358 395
> Chr6:159690000-159718000 TBLASTN match_part 25390 25620 192 + 0 Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 395 472
>
> I lose one of my hits because another HSP is tiled into it, so I discard
> the hit when I filter the context on identity values (lines 42 to 54 of
> my script).
> This HSP is far away on the 5' side, so I would like to know whether it
> would be possible to add (or to help me develop) a kind of
> hspsepQmax/hspsepSmax (the maximum allowed separation along the query or
> subject sequence between two HSPs) as a new parameter during the tiling
> phase.
>
>
>
> Thank you.
>
> Fred
>
>
>