<div dir="ltr">I am not aware of any formal information about either function or localization within PDB files.</div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, May 11, 2017 at 3:55 PM, Ahmad Abdelzaher <span dir="ltr"><<a href="mailto:underoath006@gmail.com" target="_blank">underoath006@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Sorry for the misunderstanding, but I'm mining the structures<br>
themselves. To be more precise, I'm doing structure prediction<br>
analysis. I'm not doing text mining of the headers themselves. I'm not<br>
sure which classification is used in the PDB, how consistent it<br>
is, etc. I would be interested in any information regarding<br>
function, localization, etc. If you think I'm not quite sure what<br>
I want, you'd be correct. Ultimately I will be clustering the<br>
structures based on that classification.<br>
<div class="HOEnZb"><div class="h5"><br>
On Thu, May 11, 2017 at 8:28 PM, Lenna Peterson<br>
<<a href="mailto:lenna.peterson@gmail.com">lenna.peterson@gmail.com</a>> wrote:<br>
> Well, your original question did ask how to mine the PDB header with Python.<br>
><br>
> "protein classification" is not a specific term (do you mean organism?<br>
> function? fold? etc.) - is this something that appears in the PDB header? If<br>
> so, what PDB header field is it in?<br>
><br>
> Lenna<br>
><br>
><br>
> On Thu, May 11, 2017 at 1:49 PM, Ahmad Abdelzaher <<a href="mailto:underoath006@gmail.com">underoath006@gmail.com</a>><br>
> wrote:<br>
>><br>
>> I'm not trying to mine the actual header. I would definitely be<br>
>> interested in an option that retrieves the protein classification<br>
>> without having to write any additional code. Does such an option exist?<br>
>><br>
>> Regards.<br>
>><br>
>> On Thu, May 11, 2017 at 7:06 AM, João Rodrigues<br>
>> <<a href="mailto:j.p.g.l.m.rodrigues@gmail.com">j.p.g.l.m.rodrigues@gmail.com</a><wbr>> wrote:<br>
>> > You can do *some* mining. Look at parse_pdb_header.<br>
>> ><br>
>> > 2017-05-10 18:58 GMT-07:00 Ahmad Abdelzaher <<a href="mailto:underoath006@gmail.com">underoath006@gmail.com</a>>:<br>
>> >><br>
>> >> Hey guys,<br>
>> >><br>
>> >> Unfortunately I read this in the FAQ page:<br>
>> >><br>
>> >> "If you are interested in data mining the PDB header, you might want<br>
>> >> to look elsewhere because there is only limited support for this."<br>
>> >><br>
>> >> So if I can't do it with Biopython, what other alternatives do I have?<br>
>> >> I'm doing some PDB mining and I'm interested in retrieving the<br>
>> >> classification of each structure, to do some clustering analysis later.<br>
>> >><br>
>> >> Cheers.<br>
>> >> ______________________________<wbr>_________________<br>
>> >> Biopython mailing list - <a href="mailto:Biopython@mailman.open-bio.org">Biopython@mailman.open-bio.org</a><br>
>> >> <a href="http://mailman.open-bio.org/mailman/listinfo/biopython" rel="noreferrer" target="_blank">http://mailman.open-bio.org/<wbr>mailman/listinfo/biopython</a><br>
>> ><br>
>> ><br>
>><br>
><br>
><br>
</div></div></blockquote></div><br></div>
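<div dir="ltr">For what it is worth, the only "classification" stored in a PDB-format file is the free-text keyword in the HEADER record (columns 11-50), next to the deposition date (columns 51-59) and the ID code (columns 63-66). It is not a function or localization annotation. Below is a minimal sketch of reading that field with the standard library only. The function names and the example record are made up for illustration, and the record is not a real entry. If I recall correctly, the parse_pdb_header function mentioned earlier in the thread exposes the same field under the "head" key, but please check that against your Biopython version.</div>

```python
def parse_header_record(line):
    """Split one PDB-format HEADER record into its fixed-width fields.

    Hypothetical helper for illustration; not part of Biopython.
    """
    if not line.startswith("HEADER"):
        raise ValueError("not a HEADER record")
    # Pad short lines so the fixed-width slices below never run off the end.
    line = line.rstrip("\n").ljust(80)
    return {
        "classification": line[10:50].strip(),   # columns 11-50
        "deposition_date": line[50:59].strip(),  # columns 51-59
        "id_code": line[62:66].strip(),          # columns 63-66
    }


def classification_of(lines):
    """Return the classification from the first HEADER record, or None."""
    for line in lines:
        if line.startswith("HEADER"):
            return parse_header_record(line)["classification"]
    return None


# Made-up record, built column by column so the widths are explicit.
example = (
    "HEADER".ljust(10)          # columns 1-10: record name plus padding
    + "HYDROLASE".ljust(40)     # columns 11-50: classification
    + "01-JAN-00".ljust(12)     # columns 51-62: date plus padding
    + "1ABC"                    # columns 63-66: ID code
)

if __name__ == "__main__":
    print(parse_header_record(example))
```

<div dir="ltr">To use it on a real file, pass an open file handle to classification_of, since iterating over a handle yields lines. The keyword is assigned at deposition time, so I would check how consistent it is across your entries before clustering on it.</div>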