[Bioperl-l] List summaries

Nathan Torkington gnat@oreilly.com
Wed, 11 Dec 2002 16:03:38 -0700


Is anyone interested in writing list summaries for the bioperl mailing
list?  It took me about an hour (including some attending to my kids
which a volunteer wouldn't be required to do :-) to come up with this
for the week of November 4-10.  Such things could be posted on
bioperl.org, and there might even be interest in a mailing list for
simply receiving the summaries of other mailing lists.

Obviously, having someone who understands the material will help the
quality of the summarizing, but it certainly doesn't take a guru (and
it's a good way to boost your knowledge).  Any takers?

Nat

Mailing list summary for bioperl-l
Subscription information: http://www.bioperl.org/MailList.shtml
Period: November 3-11, 2002

** Announcements

Gavin Sherlock announced the 4th MGED Programming Jamboree.
  http://www.dnachip.org/mged/

** Code Changes

Robson F. de Souza has contributed modules to Bio::Assembly.

Hilmar committed the locuslink parser.

Hilmar updated load_seqdatabase.pl, adding --lookup, --noupdate,
--testonly, --format, --pipeline, --seqfilter, and --mergeobjs.

Lincoln pointed out that removing RangeI from Bio::SeqI broke some
Bio::Graphics HOWTOs and some other scripts of his.  He suggested this
change might be breaking a lot of scripts.  Ewan still thinks it's
wrong, Nat thinks it's useful.

Hilmar told Heikki that the CPAN Graph module is already a silent
dependency (the OntologyEngineI depends on it).  Heikki tried it
in Bio::Coordinate::Graph and is suspicious of its results.  Nat
Goodman tracked down a bug and will report it to the module owner.

** Unanswered Questions

Mathieu Wiepert asked why he was getting locus numbers instead of
accession numbers when parsing blast results.  He received no answer.

David Vilanova asked (and received no answer) whether anyone was
already working on code to retrieve non-overlapping HSPs from a blast
report.  He's planning to write a script to do just that.
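
For anyone picking this up, here is a minimal sketch of one common
approach (greedy selection by score).  It is not David's script and it
does not use the BioPerl HSP objects: each HSP is a plain hash, and the
keys start, end and score are names made up for the illustration.

```perl
use strict;
use warnings;

# Greedy selection: visit HSPs from best to worst score and keep each one
# whose query range does not overlap an HSP that was already kept.
# Each HSP is a plain hash with hypothetical keys: start, end, score.
sub non_overlapping_hsps {
    my @hsps = @_;
    my @kept;
    for my $hsp (sort { $b->{score} <=> $a->{score} } @hsps) {
        # Two closed ranges overlap when each starts before the other ends.
        my $clash = grep {
            $hsp->{start} <= $_->{end} && $_->{start} <= $hsp->{end}
        } @kept;
        push @kept, $hsp unless $clash;
    }
    # Report the survivors in query order.
    my @ordered = sort { $a->{start} <=> $b->{start} } @kept;
    return @ordered;
}

my @hsps = (
    { start => 1,   end => 100, score => 200 },
    { start => 50,  end => 150, score => 120 },
    { start => 160, end => 220, score => 90  },
);
print "$_->{start}-$_->{end}\n" for non_overlapping_hsps(@hsps);
```

Greedy selection is simple but is not guaranteed to give the set with
the highest total score; it only guarantees that the best-scoring HSP
is always kept.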

Joseph Karalius had problems getting BlastReport to correctly parse
his blast reports from NCBI/GenBank using blastcl3.  He received
no response.

Peter Schattner asked for, and did not receive, help with LWP error
messages while he was installing some of the bioperl extensions.

Mikaela Ilinca Gabrielli suggested a feature to let you retrieve both
a protein sequence and its corresponding DNA sequence from a RefSeq.
Silence was the only response.

** Bugs

Dominik Gehl asked why after about 1000 blasts he gets a message about
"Too many open files".  Gudmundur A. Thorisson replied that he'd run
into it earlier but failed to file a bug report (!): BioPerl fails to
clean out references to temporary BLAST output files.  Jason Stajich
says he believes he has patched it, but "we REALLY need someone who
runs blast within perl to bulletproof test StandAloneBlast as people
seem to use this pretty heavily and it currently does not have a
maintainer."  The very next day someone posted a similar question
(with the wonderful subject line "HELP PLEASE").

Mathieu asked why a Bio::Seq object doesn't stringify nicely, and
Hilmar tracked it down to a bug in SeqFastaSpeedFactory, which was
unaware that Bio::Seq doesn't delegate primary_id.  He also pointed
out that most of the time you need display_id() not primary_id().

** Answered Questions

Lars G. T. Jorgensen asked where to download a large dataset and test
queries for BLAST.  Marco Aurelio Valtas Cunha responded, suggesting
finding all human ESTs on the chromosome using the NCBI EST data
and UCSC golden path chromosome data.

Jason Stajich answered Stefan Kirov's question about setting a
location in a SeqFeature object.  Docbug.  As Hilmar said:
  Our convention is that interfaces by default don't require an
  implementation to allow modification of the value of a
  property. We've been sloppy at stating the recommended way to
  implement changing a value.

Ed Dere asked how to retrieve the Comments field from a GenBank file.
Jamie Hatfield showed code with the magic incantation
  @values = $seq->annotation()->get_Annotations("comment");
  $comment = $values[0]->as_text();
and Hilmar pointed out that $ann->as_text() is probably the wrong 
thing to call, and text() is correct for Bio::Annotation::Comment
objects.
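
Putting the two replies together, a small sketch of the whole round
trip might look like the following.  It calls text() as Hilmar
recommends.  The helper name comment_texts and the script and file
names in the comment are hypothetical, and the BioPerl part only runs
when a GenBank file is given on the command line.

```perl
use strict;
use warnings;

# Collect the text of every comment annotation on a sequence object.
# $seq is expected to be a Bio::Seq, e.g. as returned by next_seq().
sub comment_texts {
    my ($seq) = @_;
    return map { $_->text } $seq->annotation->get_Annotations('comment');
}

# Run as: perl comments.pl example.gb   (both names are hypothetical)
if (@ARGV) {
    require Bio::SeqIO;
    my $in = Bio::SeqIO->new(-file => $ARGV[0], -format => 'genbank');
    while (my $seq = $in->next_seq) {
        print "$_\n" for comment_texts($seq);
    }
}
```

Looping over every annotation, rather than taking $values[0], also
covers entries with more than one comment block or with none at all.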

Estienne Swart asked how to list all the methods and attributes of
an object.  Marc Logghe pointed to 
  bptutorial.pl -100 Your::Favorite::Module
and when Estienne asked whether there was a general Perl function
to do this, Marc pointed to Class::Inspector on CPAN.  Heikki
immediately produced a script that interrogates any class for its
methods.
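
For the curious, the general idea can be sketched in plain Perl by
walking the symbol table.  This is not Heikki's script and not how
Class::Inspector is implemented; the function names own_methods and
all_methods and the two My::* demo classes are invented for the
example.

```perl
use strict;
use warnings;
use mro;    # core since Perl 5.10; provides mro::get_linear_isa

# Names of the subs defined in (or imported into) one package.
sub own_methods {
    my ($class) = @_;
    no strict 'refs';
    my @names = grep { defined &{"${class}::$_"} } keys %{"${class}::"};
    my @sorted = sort @names;
    return @sorted;
}

# The same, plus everything inherited through @ISA.
sub all_methods {
    my ($class) = @_;
    my %seen;
    for my $pkg (@{ mro::get_linear_isa($class) }) {
        $seen{$_} = 1 for own_methods($pkg);
    }
    my @sorted = sort keys %seen;
    return @sorted;
}

# Two throwaway classes to try it on.
package My::Base    { sub new { return bless {}, shift } sub name { return 'base' } }
package My::Derived { our @ISA = ('My::Base'); sub fetch { return 1 } }

print join(', ', all_methods('My::Derived')), "\n";    # fetch, name, new
```

Note that this lists subs, not attributes: Perl objects are usually
plain hashes, so the only general way to see an object's fields is to
look at its keys.  Methods from UNIVERSAL (can, isa) are not included.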

Tyler Alioto asked whether there was a size limit for Fasta searches
on large files.  Lincoln Stein explained three points of failure for
large (>2G) files: kernel, libc, and perl.  Allen Day added tcsh if
you're piping large files around.
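
Of Lincoln's three layers, the perl one is the easiest to check.  The
following is a small sketch, assuming only the core Config module; it
says nothing about the kernel or libc, which have to be checked
separately.

```perl
use strict;
use warnings;
use Config;

# Largest offset a signed 32-bit file position can hold: 2 GB minus one byte.
my $limit_32bit = 2**31 - 1;

# uselargefiles and lseeksize are recorded when perl itself is built.
my $large = ($Config{uselargefiles} // '') eq 'define';

printf "32-bit offset limit: %d bytes\n", $limit_32bit;
printf "perl built with large file support: %s\n", $large ? 'yes' : 'no';
printf "seek offset size: %s bytes\n", $Config{lseeksize} // 'unknown';
```

A perl built without large file support will typically fail at or near
that 32-bit limit no matter what the operating system can handle.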

Mathieu Wiepert posted sample code to save the results of a remote
BLAST run.

Heikki Lehvaslaiho explained to Dave Lee that while BioPerl has
modules that interface to the EMBOSS suite and there's a version of
PHYLIP in there, we have nothing to process the output of PHYLIP.
"Contributions would be more than welcome."  Shawn Hoon replied saying
that he had written some Phylip programs in the bioperl-run package
under Bio::Tools::Phylo::Phylip::*.

Steven Suchyta asked where to submit a GenBank dbEST batch submission
tool for publication.  Mauricio Cuadra suggested BMC Bioinformatics
  http://www.biomedcentral.com/bmcbioinformatics/
and Francis Ouellette seconded it.

Marco asked a Bio::Graphics question, which Lincoln answered with:
"The heterogeneous_segments glyph looks for a single feature
containing multiple subfeatures, and then changes the color depending
on the method tag of the subfeature."  This made your humble servant's
eyes sweat and he moved swiftly on to the next thread.

Remo Sanges answered Qiang Tu's question about writing protein
sequences with Bio::SeqIO.

Jason Stajich explained to Nathanael Kuipers how to get phylip to
accept IDs longer than ten characters (recompile with a different
nmlngth setting).

Gert Thijs asked whether there was a bundled distribution of
bioperl-run for 1.1.0, and Jason Stajich pointed him to
  http://bioperl.org/DIST/bioperl-run-1.1.0.tar.gz