[Biopython-dev] Fwd: SearchIO HSP indexing

Colin Archer colin.aibn at gmail.com
Sat Feb 9 15:19:26 UTC 2013


> > Hi Peter,
> > Thanks for getting back to me so quickly.
> >
>
> Thank you - the main reason for including SearchIO in Biopython 1.61
> as 'experimental code' is to get wider testing and feedback (hopefully
> an approach that will work well and we can use this more in future for
> other new code).
>
>
I've been using it for a couple of months now and I definitely prefer it over
the existing parser.


> > I'm curious about the benefits of having these values in Python string
> > slicing format? I haven't come across this very often, I'm used to seeing
> > values systematically zero or one-based.
>
> Once you're used to Python slicing it becomes very natural.
>
>
> > Would it be easier to keep the range variables hit_range and hit_range_all
> > in slicing format and the start and end variables in sequence position
> > format so that they represent the actual BLAST results?
>
> One reason for this is to be consistent across all the formats supported
> in SearchIO, and since Biopython is a Python library following Python
> norms seems most natural.
>
> > I had a look at some of the code and I can't see the slicing format
> > mentioned anywhere (Hsp.py, Hit.py, or blast_xml.py). It would probably be
> > helpful to explain the values in Hsp.py as a ** mark on hsp_start, hsp_end,
> > query_start, and query_end so that if people are interested they can have a
> > look at the files and see what they mean.
> >
> > Thanks
> > Colin
>
> OK, so some clarification with examples in the docstrings is needed.
> How about the Tutorial chapter?
>
I would definitely add comments to the Hsp.py file, and if there is a
tutorial that people use, I would also update that, as it would be the
first place most people would look.
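
As a rough idea of what such a docstring note could show, here is a small
sketch of the difference between the one-based, inclusive positions that BLAST
reports and the zero-based, half-open positions used for Python slicing. The
helper names blast_to_slice and slice_to_blast are made up for this example
and are not part of SearchIO.

```python
def blast_to_slice(start, end):
    """Convert one-based inclusive positions to a zero-based half-open pair.

    Hypothetical helper for illustration only, not a SearchIO function.
    """
    # Only the start shifts: the inclusive end equals the exclusive end.
    return start - 1, end


def slice_to_blast(start, end):
    """Convert a zero-based half-open pair back to one-based inclusive positions."""
    return start + 1, end


seq = "ACGTACGTAC"

# Positions as a BLAST report would give them: bases 3 through 6.
blast_start, blast_end = 3, 6

# The same region expressed in Python slicing format.
py_start, py_end = blast_to_slice(blast_start, blast_end)
fragment = seq[py_start:py_end]

# In slicing format the length is simply end minus start.
length = py_end - py_start
```

With the slicing format, the pair can be passed straight to seq[start:end] and
the span length needs no "+ 1" correction, which I assume is the main
convenience Peter is referring to.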

I was wondering whether there is any code in SearchIO to align high-scoring
segment pairs against the same hit? I see the fragmentation code, but that
seems specific to BLAT results. When I look at the HSPFragments in the
QueryResult object, it does not seem to combine multiple HSPs against the
same hit, even if they are not overlapping.
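
To make clearer what I mean by combining, here is a minimal sketch in plain
Python. It does not use any SearchIO code: the (hit_id, start, end) tuples
simply stand in for HSP coordinates in slicing format, and the function names
merge_ranges and combine_hsps_by_hit are made up for this example.

```python
from collections import defaultdict


def merge_ranges(ranges):
    """Merge overlapping or adjacent half-open (start, end) ranges."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous range, so extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def combine_hsps_by_hit(hsps):
    """Group (hit_id, start, end) tuples by hit and merge their ranges."""
    by_hit = defaultdict(list)
    for hit_id, start, end in hsps:
        by_hit[hit_id].append((start, end))
    return {hit_id: merge_ranges(ranges) for hit_id, ranges in by_hit.items()}


# Made-up coordinates, standing in for HSPs parsed from a search result.
hsps = [
    ("hitA", 0, 50),
    ("hitA", 40, 90),
    ("hitA", 120, 150),
    ("hitB", 10, 30),
]

combined = combine_hsps_by_hit(hsps)

# Total number of hit positions covered by at least one HSP.
covered_hitA = sum(end - start for start, end in combined["hitA"])
```

Here the two overlapping HSPs on hitA collapse into one range, while the
non-overlapping one stays separate but is still reported under the same hit,
which is the behaviour I was hoping to find.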

Thanks
Colin


