[Bioperl-l] bioperl newcomer's questions
Jason Raymond
jasonraymond@asu.edu
Mon, 20 Aug 2001 18:52:09 -0700
Greetings,
I'm fairly new to Perl and brand new to Bioperl, but I'm excited about
what I've seen so far. Specifically, I want to learn Bioperl to perform
two immediate tasks (which I hope to elaborate on in the long run). I
have checked quite a few of the news archives and am not sure whether
these are current tasks or perhaps readily available scripts; if not,
any pointers on how to get started are greatly appreciated!
thanks in advance,
JR
task 1:
Full sequence (not HSP) retrieval from online databases: given a query
sequence, Bioperl would BLAST (for example) the NCBI database, extract
the accession numbers of all hits above a given threshold, and then
return the entire protein or gene corresponding to each accession
number, rather than just parsing and returning the HSPs, which is
frustrating for sequence alignment.
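
To make the middle step of task 1 concrete, here is a minimal sketch of
the "extract all accession numbers above a given threshold" part. It is
plain Python rather than Bioperl, and everything in it is an assumption
for illustration: the function name select_accessions, the
(accession, E-value) pair format, and the choice of E-value as the
threshold (so "above threshold" means an E-value at or below the
cutoff). Running the BLAST search and fetching the full records are
left out.

```python
def select_accessions(hits, max_evalue):
    """Return unique accessions of hits passing the cutoff, in hit order.

    `hits` is an iterable of (accession, evalue) pairs, a hypothetical
    format standing in for whatever a BLAST report parser yields.
    """
    selected = []
    seen = set()
    for accession, evalue in hits:
        # Lower E-values are better, so a hit passes when it is at or
        # below the cutoff. Each accession is reported only once.
        if evalue <= max_evalue and accession not in seen:
            seen.add(accession)
            selected.append(accession)
    return selected
```

Each accession returned would then be passed to whatever database
client fetches the full protein or gene record.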
task 2 (perhaps computationally related to task 1):
Local sequence retrieval given a local genome database and a query
sequence: BLAST the query against an organism's genome (or multiple
organisms' genomes) and, upon finding the best hits above a certain
threshold, attempt to extract the gene coding for each match by
finding, in frame with the HSP, an upstream start codon and a
downstream stop codon. Once the full genes are extracted, it would be
good to do a quick pairwise alignment of each against the query so that
false positives can be eliminated.
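
The in-frame extension described in task 2 can be sketched roughly as
follows. This is plain Python, not Bioperl, and it rests on several
assumptions that are not in the original post: the function name
extend_hsp_to_orf is hypothetical, coordinates are 0-based and
half-open, the HSP lies on the forward strand with hsp_start on the
first base of a codon, ATG is the only start codon, and the most
upstream ATG before an in-frame stop is taken as the gene start. A
reverse-strand hit would need the sequence reverse-complemented first.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def extend_hsp_to_orf(genome, hsp_start, hsp_end):
    """Return (start, end) of the ORF containing the HSP, or None."""
    genome = genome.upper()

    # Walk upstream one codon at a time, remembering the most upstream
    # ATG seen, until an in-frame stop codon or the sequence start.
    orf_start = None
    pos = hsp_start
    while pos >= 0:
        codon = genome[pos:pos + 3]
        if codon in STOP_CODONS:
            break
        if codon == START_CODON:
            orf_start = pos
        pos -= 3

    # Walk downstream in the same frame to the first stop codon.
    orf_end = None
    pos = hsp_start
    while pos + 3 <= len(genome):
        if genome[pos:pos + 3] in STOP_CODONS:
            orf_end = pos + 3  # include the stop codon itself
            break
        pos += 3

    # Give up if either end is missing, or if a stop codon falls
    # inside the HSP so that no single ORF contains it.
    if orf_start is None or orf_end is None or orf_end < hsp_end:
        return None
    return orf_start, orf_end
```

Slicing the genome with the returned pair gives the candidate gene,
which could then be aligned against the query to weed out false
positives.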