<div dir="ltr">Hi Peter,<div><br></div><div>I ended up doing it like this:</div><div>
<p class="inbox-inbox-p1"><span class="inbox-inbox-s1">import</span> urllib.request</p>
<p class="inbox-inbox-p1"><span class="inbox-inbox-s1">import</span> operator</p>
<p class="inbox-inbox-p2"><span class="inbox-inbox-s2">url = </span>'<a href="ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.14_GRCh37.p13/GCA_000001405.14_GRCh37.p13_assembly_report.txt">ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.14_GRCh37.p13/GCA_000001405.14_GRCh37.p13_assembly_report.txt</a>'</p>
<p class="inbox-inbox-p1"><span class="inbox-inbox-s3"><b>with</b></span> urllib.request.urlopen(url) <span class="inbox-inbox-s3"><b>as</b></span> response:</p>
<p class="inbox-inbox-p1"><span class="inbox-inbox-Apple-converted-space">    </span>d_lengths = {}</p>
<p class="inbox-inbox-p1"><span class="inbox-inbox-Apple-converted-space">    </span><span class="inbox-inbox-s3"><b>for</b></span> l <span class="inbox-inbox-s3"><b>in</b></span> <span class="inbox-inbox-s4">filter</span>(</p>
<p class="inbox-inbox-p3"><span class="inbox-inbox-s2"><span class="inbox-inbox-Apple-converted-space">        </span></span>## Skip if Sequence-Role is not assembled-molecule.</p>
<p class="inbox-inbox-p2"><span class="inbox-inbox-s2"><span class="inbox-inbox-Apple-converted-space">        </span></span><span class="inbox-inbox-s3"><b>lambda</b></span><span class="inbox-inbox-s2"> x: x[</span>1<span class="inbox-inbox-s2">] == </span>'assembled-molecule'<span class="inbox-inbox-s2">,</span></p>
<p class="inbox-inbox-p3"><span class="inbox-inbox-s2"><span class="inbox-inbox-Apple-converted-space">        </span></span>## Split line string into a list.</p>
<p class="inbox-inbox-p1"><span class="inbox-inbox-Apple-converted-space">        </span><span class="inbox-inbox-s4">map</span>(operator.methodcaller(<span class="inbox-inbox-s5">'split'</span>, <span class="inbox-inbox-s5">'</span><span class="inbox-inbox-s6">\t</span><span class="inbox-inbox-s5">'</span>),</p>
<p class="inbox-inbox-p3"><span class="inbox-inbox-s2"><span class="inbox-inbox-Apple-converted-space">            </span></span>## Skip header/comment lines and strip newline characters.</p>
<p class="inbox-inbox-p1"><span class="inbox-inbox-Apple-converted-space">            </span><span class="inbox-inbox-s4">map</span>(<span class="inbox-inbox-s4">str</span>.rstrip, <span class="inbox-inbox-s4">filter</span>(</p>
<p class="inbox-inbox-p1"><span class="inbox-inbox-Apple-converted-space">            </span><span class="inbox-inbox-s3"><b>lambda</b></span> x: x[<span class="inbox-inbox-s5">0</span>] != <span class="inbox-inbox-s5">'#'</span>,</p>
<p class="inbox-inbox-p3"><span class="inbox-inbox-s2"><span class="inbox-inbox-Apple-converted-space">            </span></span>## Decode with utf-8 from bytes to string.</p>
<p class="inbox-inbox-p1"><span class="inbox-inbox-Apple-converted-space">            </span><span class="inbox-inbox-s4">map</span>(<span class="inbox-inbox-s4">bytes</span>.decode, response))))):</p>
<p class="inbox-inbox-p1"><span class="inbox-inbox-Apple-converted-space">        </span>chrom = l[<span class="inbox-inbox-s5">0</span>]</p>
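<p class="inbox-inbox-p3"><span class="inbox-inbox-s2"><span class="inbox-inbox-Apple-converted-space">        </span></span>## Sequence-Length is the ninth column (index 8); index 9 is UCSC-style-name.</p>
<p class="inbox-inbox-p1"><span class="inbox-inbox-Apple-converted-space">        </span>length = <span class="inbox-inbox-s4">int</span>(l[<span class="inbox-inbox-s5">8</span>])</p>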
<p class="inbox-inbox-p1"><span class="inbox-inbox-Apple-converted-space">        </span>d_lengths[chrom] = length</p></div></div><br><div class="gmail_quote"><div dir="ltr">On Wed, 22 Mar 2017 at 16:59 Peter Cock <<a href="mailto:p.j.a.cock@googlemail.com">p.j.a.cock@googlemail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hmm.<br class="gmail_msg">
<br class="gmail_msg">
Using the NCBI Entrez API, you could certainly download these as<br class="gmail_msg">
FASTA or GenBank files, either of which would give you the length.<br class="gmail_msg">
But I don't think that offers GFF files.<br class="gmail_msg">
<br class="gmail_msg">
I don't work on model organisms, but I'd suggest ENSEMBL might<br class="gmail_msg">
be a good bet - but we don't yet have a Biopython module for that?<br class="gmail_msg">
<br class="gmail_msg">
<a href="http://www.ensembl.org/" rel="noreferrer" class="gmail_msg" target="_blank">http://www.ensembl.org/</a><br class="gmail_msg">
<a href="https://github.com/biopython/biopython/issues/512" rel="noreferrer" class="gmail_msg" target="_blank">https://github.com/biopython/biopython/issues/512</a><br class="gmail_msg">
<br class="gmail_msg">
It might be worth looking at bioservices for this?<br class="gmail_msg">
<br class="gmail_msg">
<a href="https://github.com/cokelaer/bioservices" rel="noreferrer" class="gmail_msg" target="_blank">https://github.com/cokelaer/bioservices</a><br class="gmail_msg">
<br class="gmail_msg">
Peter<br class="gmail_msg">
<br class="gmail_msg">
On Wed, Mar 22, 2017 at 4:24 PM, Tommy Carstensen<br class="gmail_msg">
<<a href="mailto:tommy.carstensen@gmail.com" class="gmail_msg" target="_blank">tommy.carstensen@gmail.com</a>> wrote:<br class="gmail_msg">
> Is it possible to get the chromosome lengths in maize (Zea mays), zebra fish<br class="gmail_msg">
> and humans with Biopython without having the relevant gff files? How would I<br class="gmail_msg">
> go about doing that? Basically I just want to be able to fetch the gff by<br class="gmail_msg">
> typing in species='homo sapiens' and build=37 or something like that without<br class="gmail_msg">
> having to worry about URLs.<br class="gmail_msg">
><br class="gmail_msg">
> Could Biopython also give me the position of the pseudoautosomal regions on<br class="gmail_msg">
> the X chromosome in Homo sapiens?<br class="gmail_msg">
><br class="gmail_msg">
> Thanks,<br class="gmail_msg">
> Tommy<br class="gmail_msg">
><br class="gmail_msg">
> _______________________________________________<br class="gmail_msg">
> Biopython mailing list  -  <a href="mailto:Biopython@mailman.open-bio.org" class="gmail_msg" target="_blank">Biopython@mailman.open-bio.org</a><br class="gmail_msg">
> <a href="http://mailman.open-bio.org/mailman/listinfo/biopython" rel="noreferrer" class="gmail_msg" target="_blank">http://mailman.open-bio.org/mailman/listinfo/biopython</a><br class="gmail_msg">
</blockquote></div>
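<div dir="ltr"><div>The nested map/filter pipeline at the top of this reply can also be written as a plain loop, which may be easier to read and to test without a network connection. This is only a sketch: the function name chromosome_lengths is made up for illustration, and the column positions (Sequence-Role at index 1, Sequence-Length at index 8) are an assumption taken from the header line of the assembly report.</div>
<pre>
```python
def chromosome_lengths(lines):
    """Return {Sequence-Name: Sequence-Length} for assembled molecules.

    `lines` is any iterable of already-decoded text lines from an NCBI
    assembly report. The column indices used below are assumptions based
    on the report's header line.
    """
    lengths = {}
    for line in lines:
        # Skip blank lines and the '#' header/comment lines.
        if not line.strip() or line.startswith('#'):
            continue
        # Drop the line ending only, so empty trailing fields survive.
        fields = line.rstrip('\r\n').split('\t')
        # Keep chromosomes; skip scaffolds, patches and alternate loci.
        if fields[1] == 'assembled-molecule':
            # Assumed layout: index 0 is Sequence-Name, index 8 is Sequence-Length.
            lengths[fields[0]] = int(fields[8])
    return lengths
```
</pre>
<div>Inside the with block above it could be called as chromosome_lengths(map(bytes.decode, response)), which keeps the download and the parsing separate.</div></div>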