[Bioperl-l] Bio::Tools::Blast::HTML questions
Zhao, David [PRI]
DZhao1@prius.jnj.com
Mon, 18 Sep 2000 13:20:40 -0400
First of all, thank you all very much for your reply.
I've looked at the code: it was "\w+" in my HTML.pm, but I didn't realize I
had 0.6.0 instead of 0.6.1.
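For anyone hitting the same thing, here is a minimal sketch of why the ".1" matters. It assumes the 0.6.0 word pattern was "\w+" and the 0.6.1 one is "[\w_.]+" (the quantifier is my guess). The accession() helper and the cut-down pattern are hypothetical stand-ins, not the actual code in Bio::Tools::Blast::HTML.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for the two $Word definitions discussed in this
# thread; the real definitions in Bio::Tools::Blast::HTML may differ.
my $word_old = '\w+';       # as found in my 0.6.0 copy
my $word_new = '[\w_.]+';   # assumed 0.6.1 form: also accepts '.' (and '_')

# Cut-down version of the GenBank/EMBL/DDBJ id match: a database tag,
# then an accession that must be followed by '|'.
sub accession {
    my ($line, $word) = @_;
    return $line =~ /^ ?(?:gb|emb|dbj)\|($word)\|/ ? $1 : undef;
}

my $line = 'dbj|AU027194.1|AU027194 Rattus norvegicus, OTSUKA clone, OT17.21...   52  4e-06';

for my $case (['0.6.0', $word_old], ['0.6.1', $word_new]) {
    my ($version, $word) = @$case;
    my $acc = accession($line, $word);
    printf "%s: %s\n", $version, defined $acc ? "matched $acc" : 'no match';
}
# prints:
# 0.6.0: no match
# 0.6.1: matched AU027194.1
```

With "\w+" the match stops at the "." in "AU027194.1" and never reaches the "|", which fits the line only working once the ".1" was removed.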
I'll try to write a better bug report next time.
Hope this won't affect my future bug reports.
Thanks again
David
> -----Original Message-----
> From: Andrew Dalke [SMTP:dalke@acm.org]
> Sent: Friday, September 15, 2000 10:08 PM
> To: Zhao, David [PRI]
> Cc: 'Bio-Perl'
> Subject: Re: [Bioperl-l] Bio::Tools::Blast::HTML questions
>
> Zhao, David [PRI] <DZhao1@prius.jnj.com> said:
> > It seems that nobody has had the same problem, or you guy think
> > this is just not significant enough to be answered.
>
> Actually, there could be several other reasons. For example, almost
> the only time I get HTML formatted mail is from junk/spam mail, so
> I have an almost instinctual urge to delete those mails when I see
> them. Since your message in no way needed the extra abilities of
> HTML, you should have used ASCII instead.
>
> Second, it was hard to figure out what the problem was, if just
> given your description. A more helpful report might have been
>
> ] Hi there,
> ] It seems the HTML module doesn't recognize the genbank format in
> ] the summary table. The lines look like:
> ]
> ] dbj|AU027194.1|AU027194 Rattus norvegicus, OTSUKA clone, OT17.21... 52 4e-06
> ]
> ] When I replace "AU027194.1" with "AU027194" (dropping the ".1") it works.
> ] Here's the full table:
> ] ...
>
> It isn't much harder to write than your original email, but it is much
> easier for someone else to understand.
>
> It would also be helpful to know what "doesn't recognize" means. Does
> it stop with an error? Is the line just ignored? Is the rest of the
> input file ignored?
>
> The better a bug report is, the more likely it will be answered. But
> it takes effort and practice to learn how to write a good report.
>
> Third, given that it's been over 24 hours, you could have messed around
> with the code yourself. A good bug report can almost guide you to where
> to look in the code.
>
> In this case, that's the code for parsing the summary table, most likely
> related to parsing genbank lines. From a quick perusal, the problem would
> likely be in the section:
>
> ## REGEXPS FOR SUMMARY TABLE LINES AT TOP OF REPORT (a.k.a. 'descriptions')
> ## (table of sequence id, description, score, P/Expect value, n)
> ##
> ## Not using bold face to highlight the sequence id's since this can throw
> ## off formatting of the line when the IDs are different lengths. This leads to
> ## the scores and P/Expect values not lining up properly.
>
> ### NCBI-specific markups for description lines:
>
> # GenBank/EMBL, DDBJ hits (GenBank Format):
> s@^ ?(gb|emb|dbj)\|($Word)(\|$Word)?($Descrip)($Int +)($Signif)(.*)$@$1:<a href="$DbUrl{'gb_n'}$2">$2$3</a>$4$5<A href="\#$2_A">$6</a>$7<a name="$2_H"></a>@o;
>
> It wasn't very hard to find this code.
>
> If you look at the definition of "Word" you'll see it is defined as
> "[\w_.]", so this pattern *should* match the data line you give.
>
> So in a followup email you could describe your hypothesis of the problem
> and what you've done to track it down.
>
>
> Fourth, given that the code is correct, you could check the CVS logs,
> available even to anonymous external users, and see
>
> revision 1.3.2.1
> date: 2000/05/18 20:53:31; author: sac; state: Exp; lines: +6 -4
> - The $Word and $Acc strings now include '.' to accomodate accessions with
> version number. Word also allows '_' to work with ref seq accessions.
> - Silencing warnings during _markup_report.
>
> Checking bioperl-0.6.1.tar.gz (with the file datestamp on the ftp site of
> May 19, 2000, so the day after Steve's fix) you'll see that the code
> contains the fix mentioned in the CVS log.
>
> So the answer to your statement:
> > It seems that nobody has had the same problem, or you guy think
> > this is just not significant enough to be answered.
>
> is that it has been seen, corrected, and distributed almost 4 months
> ago, so nobody has the problem. You need to update your distribution.
> Also, I'll bet that only 2 or 3 people ever saw the bug before it was
> fixed, so most of the people on the list really have not ever seen
> the problem and could not answer your email without spending non-trivial
> time digging through the back logs.
>
> This also means that if you are submitting a bug report, you do need
> to include the version number in which you found the problem.
>
> Andrew Dalke
> dalke@acm.org
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l