[Biopython] Tutorial Question 7.4 alignment.title
Ara Kooser
akooser at unm.edu
Fri Oct 8 16:01:58 UTC 2010
Peter,
Thank you for those suggestions. I hadn't thought of using BLAST+.
I will check that out this weekend.
Regards,
Ara
On Oct 8, 2010, at 9:56 AM, Peter wrote:
> On Fri, Oct 8, 2010 at 4:45 PM, Ara Kooser <akooser at unm.edu> wrote:
>> Peter,
>>
>> Thanks for your reply. I started to fiddle around with parsing the string
>> last night but haven't made much progress.
>>
>> At the moment the output looks like this:
>>
>> ****Alignment****
>> sequence: gi|302529614|ref|ZP_07281956.1| predicted protein [Streptomyces sp. AA4] >gi|302438509|gb|EFL10325.1| predicted protein [Streptomyces sp. AA4]
>> e value: 1.89229e-46
>> length: 1109
>> start: 7
>> end: 414
>>
>> So what I want from the sequence string is the following:
>> [Streptomyces sp. AA4]
>> ZP_07281956.1
>>
>> printed out as separated lines like the rest of the output.
>
> You could do this with regular expressions (import re), or some simple
> Python string searching for the square brackets, etc.
>
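A minimal sketch of the regular expression approach, assuming the title
always has the pipe-delimited form shown above (gi|number|db|accession|
description [organism]). The function name parse_title is hypothetical,
not part of Biopython, and titles from other databases may need a
different pattern.

```python
import re

# Title string as quoted in this thread (two merged entries separated by " >").
title = ("gi|302529614|ref|ZP_07281956.1| predicted protein [Streptomyces sp. AA4] "
         ">gi|302438509|gb|EFL10325.1| predicted protein [Streptomyces sp. AA4]")


def parse_title(title):
    """Return (organism, accession) from the first entry of a BLAST hit title.

    Assumes the gi|number|db|accession| layout; returns None for a part
    that cannot be found.
    """
    # Keep only the first entry when several identical hits are merged.
    first = title.split(" >", 1)[0]
    # The accession is the fourth pipe-delimited field in this layout.
    fields = first.split("|")
    accession = fields[3] if len(fields) > 3 else None
    # The organism is whatever sits inside the first pair of square brackets.
    match = re.search(r"\[([^\]]+)\]", first)
    organism = match.group(1) if match else None
    return organism, accession


organism, accession = parse_title(title)
print("[%s]" % organism)
print(accession)
```

In the tutorial loop this would be called on alignment.title, printing
the two values on separate lines like the rest of the output.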
>> After that is figured out I want to put all the information in columns so it
>> can be read into a spreadsheet in OO so that it looks like this:
>> Name Locus # E_value Length Start End
>
> It would be much simpler to ask BLAST to give you tabular output.
> If you are using BLAST+ you can even specify which columns you
> want (although this won't pull out the organism name for you).
>
> Peter
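A rough sketch of the tabular route, under some assumptions: BLAST+ is
run with a custom column list such as -outfmt "6 sacc evalue slen sstart
send", and those columns correspond to the locus, e-value, length, start
and end printed earlier (that mapping is a guess and should be checked
against the BLAST+ help). The sample line below is built from the values
quoted in this thread for illustration, not taken from a real run, and
the names COLUMNS, read_hits and write_table are hypothetical.

```python
import csv
import io

# Assumed column order, matching the assumed -outfmt column list above.
COLUMNS = ["Locus", "E_value", "Length", "Start", "End"]

# Illustrative line using values quoted in this thread; not real BLAST output.
sample = "ZP_07281956.1\t1.89229e-46\t1109\t7\t414\n"


def read_hits(handle):
    """Return one dict per tab-delimited hit line, keyed by COLUMNS."""
    reader = csv.reader(handle, delimiter="\t")
    return [dict(zip(COLUMNS, row)) for row in reader if row]


def write_table(hits, handle):
    """Write the hits as tab-delimited text with a header row."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(hits)


hits = read_hits(io.StringIO(sample))
out = io.StringIO()
write_table(hits, out)
print(out.getvalue(), end="")
```

With real data, the StringIO objects would be replaced by open file
handles, and the resulting file can be opened as tab-delimited text in a
spreadsheet. As Peter notes, the organism name is not one of these
columns and would still have to be pulled from the title separately.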