[Bioperl-l] bioperl newcomer's questions

Jason Raymond jasonraymond@asu.edu
Mon, 20 Aug 2001 18:52:09 -0700


Greetings,
I'm fairly new to Perl and brand new to Bioperl, but I'm excited about
what I've seen so far.  Specifically, I want to learn Bioperl to perform
two immediate tasks (which I hope to elaborate on in the long run).  I
have checked quite a few of the news archives and am not sure whether
these are already supported tasks or perhaps readily available scripts;
if not, any pointers on how to get started would be greatly appreciated!
Thanks in advance,
JR

task 1:
Full sequence (not HSP) retrieval from online databases: given a query
sequence, Bioperl would BLAST (for example) the NCBI database, extract
the accession numbers of all hits above a given threshold, and then
(rather than just parsing and returning HSPs, which is frustrating for
sequence alignment) return the entire protein or gene corresponding to
each accession number.
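
To make the filtering step of task 1 concrete, here is a minimal sketch in Python (not Bioperl). It assumes the BLAST results are already available as 12-column tab-separated text (query ID, subject ID, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score) and that "above a given threshold" means an e-value at or below a cutoff. The function name, the default cutoff, and the sample IDs are all made up for illustration; running BLAST and fetching the full sequences are not shown.

```python
def accessions_above_threshold(tabular_text, max_evalue=1e-10):
    """Return subject IDs with at least one HSP whose e-value is <= max_evalue.

    Assumes 12-column tab-separated BLAST output, with the subject ID in
    column 2 and the e-value in column 11. Each ID is reported once, in
    order of first appearance, even if several HSPs pass the cutoff.
    """
    accepted = []
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id = fields[1]
        evalue = float(fields[10])
        if evalue <= max_evalue and subject_id not in accepted:
            accepted.append(subject_id)
    return accepted


# Made-up example rows: two HSPs for HYPO_A, one weak HSP for HYPO_B.
EXAMPLE_HITS = "\n".join([
    "query1\tHYPO_A\t98.0\t200\t4\t0\t1\t200\t1\t200\t1e-100\t380",
    "query1\tHYPO_A\t90.0\t50\t5\t0\t210\t260\t220\t270\t1e-20\t95",
    "query1\tHYPO_B\t35.0\t80\t52\t0\t10\t90\t5\t85\t0.5\t30",
])

if __name__ == "__main__":
    print(accessions_above_threshold(EXAMPLE_HITS))
```

Each returned ID would then be handed to a separate retrieval step that fetches the full database entry instead of only the aligned HSP region.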

task 2 (perhaps computationally related to task 1):
Local sequence retrieval given a local genome database and a query
sequence: given a query sequence, BLAST against an organism's genome (or
multiple organisms' genomes) and, upon finding the best hits above a
certain threshold, attempt to extract the gene coding for each match by
finding, in frame with the HSP, an upstream start codon and a downstream
stop codon.  Once the full genes are extracted, it would be good to do a
quick pairwise alignment of them against the query so that false
positives can be eliminated.
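
The gene-extraction step of task 2 can be sketched as follows, in Python (not Bioperl). This is only one possible reading of the idea: it assumes a forward-strand HSP given as 0-based, half-open nucleotide coordinates that start on a codon boundary, takes the nearest in-frame ATG at or upstream of the HSP start, gives up if an in-frame stop codon is met first, and then takes the first in-frame stop codon at or downstream of the HSP end. The function name and the toy sequence are made up for illustration; reverse-strand hits, alternative start codons, and the final pairwise alignment are not handled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def extend_hsp_to_orf(genome, hsp_start, hsp_end):
    """Extend an in-frame HSP to a candidate ORF on the forward strand.

    Coordinates are 0-based and half-open. Returns (orf_start, orf_end),
    with the stop codon included, or None if no in-frame start or stop
    codon is found.
    """
    if (hsp_end - hsp_start) % 3 != 0:
        raise ValueError("HSP length must be a multiple of 3")
    genome = genome.upper()

    # Walk upstream one codon at a time, in frame with the HSP.
    orf_start = None
    pos = hsp_start
    while pos >= 0:
        codon = genome[pos:pos + 3]
        if codon == START_CODON:
            orf_start = pos
            break
        if codon in STOP_CODONS:
            break  # an in-frame stop before any start: no usable ORF
        pos -= 3
    if orf_start is None:
        return None

    # Walk downstream from the HSP end until the first in-frame stop.
    pos = hsp_end
    while pos + 3 <= len(genome):
        if genome[pos:pos + 3] in STOP_CODONS:
            return orf_start, pos + 3
        pos += 3
    return None


if __name__ == "__main__":
    # Toy sequence: the HSP covers only the GGG codon at positions 8..11.
    toy = "CC" + "ATG" + "AAA" + "GGG" + "CCC" + "TAA" + "GG"
    span = extend_hsp_to_orf(toy, 8, 11)
    print(span, toy[span[0]:span[1]])
```

Choosing the nearest upstream ATG keeps the sketch short; taking the most distant ATG before the first upstream stop codon is an equally plausible rule and would change only the upstream loop.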