<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Title" content="">
<meta name="Keywords" content="">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:"Courier New";
        panose-1:2 7 3 9 2 2 5 2 4 4;}
@font-face
        {font-family:Wingdings;
        panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p
        {mso-style-priority:99;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:12.0pt;
        font-family:"Times New Roman";}
pre
        {mso-style-priority:99;
        mso-style-link:"HTML Preformatted Char";
        margin:0in;
        margin-bottom:.0001pt;
        font-size:10.0pt;
        font-family:"Courier New";}
tt
        {mso-style-priority:99;
        font-family:"Courier New";}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0in;
        margin-right:0in;
        margin-bottom:0in;
        margin-left:.5in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman";}
span.HTMLPreformattedChar
        {mso-style-name:"HTML Preformatted Char";
        mso-style-priority:99;
        mso-style-link:"HTML Preformatted";
        font-family:Courier;}
span.EmailStyle21
        {mso-style-type:personal-reply;
        font-family:Calibri;
        color:windowtext;}
span.msoIns
        {mso-style-type:export-only;
        mso-style-name:"";
        text-decoration:underline;
        color:teal;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
/* List Definitions */
@list l0
        {mso-list-id:111747589;
        mso-list-type:hybrid;
        mso-list-template-ids:555136874 2090658488 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;}
@list l0:level1
        {mso-level-start-at:0;
        mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:Wingdings;
        mso-fareast-font-family:Calibri;
        mso-bidi-font-family:"Times New Roman";}
@list l0:level2
        {mso-level-number-format:bullet;
        mso-level-text:o;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:"Courier New";}
@list l0:level3
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:Wingdings;}
@list l0:level4
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:Symbol;}
@list l0:level5
        {mso-level-number-format:bullet;
        mso-level-text:o;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:"Courier New";}
@list l0:level6
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:Wingdings;}
@list l0:level7
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:Symbol;}
@list l0:level8
        {mso-level-number-format:bullet;
        mso-level-text:o;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:"Courier New";}
@list l0:level9
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:Wingdings;}
ol
        {margin-bottom:0in;}
ul
        {margin-bottom:0in;}
--></style>
</head>
<body bgcolor="white" lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">We would probably need a list of the IDs to be sure, but this has come up a few times before. In some cases it’s a line-ending mismatch, which can be normalized with a tool like dos2unix. However, if any of your IDs could be evaluated as false, the issue is trickier and not so easy to fix, primarily because the returned value is stringified to the display ID (which is one reason I hate object stringification).<o:p></o:p></span></p>
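<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">If dos2unix is not at hand, the line-ending normalization can be sketched in plain Perl. This is only a minimal illustration: normalize_line_endings is a hypothetical helper name, the sample record is made up, and the sketch assumes the file is small enough to hold in memory as a string.<o:p></o:p></span></p>
<pre>
```perl
use strict;
use warnings;

# Hypothetical helper: convert Windows (CRLF) and bare CR line endings
# to Unix (LF). Roughly what dos2unix does to a text file.
sub normalize_line_endings {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;
    return $text;
}

# Made-up FASTA record with Windows line endings, for illustration only.
my $fasta = ">seq1 desc1\r\nATATATGTGC\r\n";
print normalize_line_endings($fasta);
```
</pre>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">After normalizing the file, the index would need to be rebuilt, for instance with the -reindex option already used in the code quoted below.<o:p></o:p></span></p>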
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">For example, the following would likely short-circuit without showing any sequences, because a seq ID of ‘0’ (the description is separate and not part of the ID) evaluates as false and kills the while loop:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">>0 desc1<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">ATATATGTGC<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">>1 desc2<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">CGCGCCGCGC<o:p></o:p></span></p>
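<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">The effect can be reproduced in plain Perl without BioPerl. In the sketch below, make_stream, count_truthy and count_defined are hypothetical names, and the stream simply returns ID strings as a stand-in for objects that stringify to their display ID; it is not the Bio::DB::Fasta implementation. Whether a defined() test behaves the same way on the real stream object is an assumption that should be checked against the linked issue.<o:p></o:p></span></p>
<pre>
```perl
use strict;
use warnings;

# Hypothetical stand-in for a sequence stream: each call returns the next
# ID string, then undef when exhausted. NOT Bio::DB::Fasta; it only mimics
# items that behave like their display ID in boolean context.
sub make_stream {
    my @ids = @_;
    return sub { return shift @ids };
}

# Plain truthiness test, as in: while ($seq = $stream->next_seq()) { ... }
sub count_truthy {
    my ($next) = @_;
    my $n = 0;
    while (my $id = $next->()) { $n++ }
    return $n;
}

# Definedness test: an ID of '0' no longer ends the loop early.
sub count_defined {
    my ($next) = @_;
    my $n = 0;
    while (defined(my $id = $next->())) { $n++ }
    return $n;
}

print count_truthy(make_stream('0', '1')), "\n";   # prints 0
print count_defined(make_stream('0', '1')), "\n";  # prints 2
```
</pre>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">With the IDs from the FASTA example above, the truthiness loop stops at the first record, while the defined() loop visits both.<o:p></o:p></span></p>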
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">The issue, the problems with a fix, and a workaround are described here:</span>
<span style="font-size:11.0pt;font-family:Calibri">https://github.com/bioperl/bioperl-live/issues/170<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">chris<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-family:Calibri;color:black">From: </span>
</b><span style="font-family:Calibri;color:black">Bioperl-l &lt;bioperl-l-bounces+cjfields=illinois.edu@mailman.open-bio.org&gt; on behalf of Helene RIMBERT &lt;helene.rimbert@inra.fr&gt;<br>
<b>Date: </b>Monday, November 14, 2016 at 10:16 AM<br>
<b>To: </b>"bioperl-l@mailman.open-bio.org" &lt;bioperl-l@mailman.open-bio.org&gt;<br>
<b>Subject: </b>[Bioperl-l] Bio::DB::Fasta problem: unable to fetch all sequences via get_PrimarySeq_stream<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><tt><span style="font-size:10.0pt">Dear BioPerl developers,</span></tt><span style="font-size:10.0pt;font-family:'Courier New'"><br>
<br>
<tt>I have a question regarding get_PrimarySeq_stream.</tt><br>
<br>
<tt>I am using the Bio::DB::Fasta module to access my FASTA sequences and I am having a problem with get_PrimarySeq_stream().</tt></span><br>
<tt><span style="font-size:10.0pt">When I check the content of the db object, all the sequences are indexed (I mean that I can see all the sequence IDs in the offsets hash).</span></tt><span style="font-size:10.0pt;font-family:'Courier New'"><br>
<br>
<tt>I then use get_PrimarySeq_stream to loop over all my sequences, but only 1 sequence is retrieved from the stream object.</tt><br>
<tt>I tried to look for an explanation, and the only thing I could find is that my seq_ids seem to be considered undef during the while($dbstream->next_seq()) statement when reaching</tt><br>
<tt>IndexedBase.pm line 1116.</tt><br>
<br>
<tt>I tried looping over all sequence IDs using my @seq_ids = $self->{fastaObj}->get_all_primary_ids; and that works very well.</tt><br>
<br>
<tt>I don't understand why the stream object does not retrieve all the sequences whereas get_all_primary_ids does!</tt><br>
<tt>Is there something wrong with my input FASTA (my IDs are very long...) or am I missing something?</tt><br>
<br>
<tt>I am really interested in finding out why I am not able to use get_PrimarySeq_stream!</tt><br>
<br>
<tt>Many thanks in advance :)</tt><br>
<br>
<tt>Regards,</tt><br>
<br>
<tt>Helene</tt><br>
<br>
<tt>#----------------------------------</tt><br>
<tt># here is the part of code that causes problem:</tt><br>
<tt># initialize db::fasta object</tt><br>
<tt>$self->{fastaObj} =  Bio::DB::Fasta->new("test2.fna", -reindex => 1);</tt><br>
<br>
<tt># create stream object</tt><br>
<tt>my $seq_stream = $self->{fastaObj}->get_PrimarySeq_stream();</tt><br>
<tt>$self->{nbSeqFetchedInStream}=0;</tt><br>
<br>
<tt># loop over all seq in BioDBFasta obj using stream obj.</tt><br>
<tt>while ($self->{seq} = $seq_stream->next_seq()){</tt><br>
<tt>#foreach my $seq_id (@seq_ids){</tt><br>
<tt>    #$self->{seq} = $self->{fastaObj}->get_Seq_by_id($seq_id); # to use with foreach loop</tt><br>
<br>
<tt>    print (" New sequence: ", Dumper $self->{seq});</tt><br>
<tt>    $self->{nbSeqFetchedInStream}++;</tt><br>
<tt>}</tt><br>
<tt>print (" Fetched sequences in _PrimarySeq_stream: $self->{nbSeqFetchedInStream}");</tt><br>
<tt>#----------------------------------</tt><br>
<br>
<br>
<br>
<br>
</span><o:p></o:p></p>
<div>
<p class="MsoNormal">-- <o:p></o:p></p>
<p><b>--> New e-mail address: <a href="mailto:helene.rimbert@inra.fr">helene.rimbert@inra.fr</a> <--</b><o:p></o:p></p>
<pre>Hélène RIMBERT<o:p></o:p></pre>
<pre>Bioinformatic Engineer<o:p></o:p></pre>
<pre><a href="mailto:helene.rimbert@inra.fr">helene.rimbert@inra.fr</a><o:p></o:p></pre>
<pre>UMR 1095 INRA/UBP – Site de Crouel<o:p></o:p></pre>
<pre>Tel.: +33 (0)4 73 62 43 49<o:p></o:p></pre>
<pre>5 chemin de beaulieu<o:p></o:p></pre>
<pre>63039 Clermont-Ferrand Cedex 2<o:p></o:p></pre>
<pre>France<o:p></o:p></pre>
<pre><a href="https://www6.ara.inra.fr/umr1095_eng/">https://www6.ara.inra.fr/umr1095_eng/</a><o:p></o:p></pre>
</div>
</div>
</body>
</html>