[Bioperl-l] Packages retrieving online alignment sequences
Jun Yin
jun.yin at ucd.ie
Fri Aug 6 10:52:14 UTC 2010
Hi, all,
I am the Google Summer of Code student working on refactoring the Bio::Align
subsystem. I recently implemented several packages for retrieving alignments
from online databases. The aim of the packages is to provide convenient
methods for BioPerl users to retrieve online alignment sequences. After
retrieval, the alignments are converted into Bio::SimpleAlign objects, which
are easy to manipulate and write to local disk. The packages currently
support the Pfam, Rfam, Prosite and Entrez Protein Clusters databases.
Here is the structure of the packages:
Packages
Bio::DB::Align (interface; calls the other packages)
Bio::DB::Align::Pfam (retrieves alignments from Pfam)
Bio::DB::Align::Rfam (retrieves alignments from Rfam)
Bio::DB::Align::Prosite (retrieves alignments from Prosite)
Bio::DB::Align::ProtClustDB (retrieves alignments from the Entrez Protein
Clusters Database)
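To illustrate how the interface package could hand a request over to the database-specific packages, here is a minimal sketch. It is not the actual code of Bio::DB::Align: the lookup table, the helper name package_for_db and the database keys other than "rfam" are assumptions made for this example.

```perl
use strict;
use warnings;

# Assumed mapping from the -db argument to the package that handles it.
# Only "rfam" appears in this mail; the other keys are guesses.
my %PACKAGE_FOR = (
    pfam        => 'Bio::DB::Align::Pfam',
    rfam        => 'Bio::DB::Align::Rfam',
    prosite     => 'Bio::DB::Align::Prosite',
    protclustdb => 'Bio::DB::Align::ProtClustDB',
);

# Hypothetical helper: return the package name for a database name,
# ignoring case, and die if the database is not in the table.
sub package_for_db {
    my ($db) = @_;
    die "No database name given\n" unless defined $db;
    my $package = $PACKAGE_FOR{ lc $db }
        or die "Unsupported database '$db'\n";
    return $package;
}

print package_for_db('rfam'), "\n";
print package_for_db('Pfam'), "\n";
```

A table like this keeps the list of supported databases in one place, so adding a database would only need a new entry and a new package.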
Usually four methods are provided by each package:
Methods
get_Aln_by_id (retrieves an alignment by id and returns a Bio::SimpleAlign
object)
get_Aln_by_acc (retrieves an alignment by accession and returns a
Bio::SimpleAlign object) (Rfam and Prosite support only this method)
id2acc (id to accession conversion)
acc2id (accession to id conversion)
These packages depend on LWP::UserAgent, HTTP::Request and
Bio::DB::GenericWebAgent. Bio::DB::Align::ProtClustDB depends on
Bio::DB::EUtilities.
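The packages can be called like this:
use Bio::DB::Align;
# Create the database object through the general interface ...
my $dbobj = Bio::DB::Align->new(-db => "rfam");
# ... or directly from a database-specific package, e.g. for Pfam:
# my $dbobj = Bio::DB::Align::Pfam->new();
# Retrieve an alignment by accession, with positional or named arguments
my $aln  = $dbobj->get_Aln_by_acc("RF0001");
my $aln2 = $dbobj->get_Aln_by_acc(-accession => "RF0001", -alignment => "full");
print $aln->length();
foreach my $seq ($aln->each_Seq) {
    # do something
}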
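As a rough sketch of what manipulating a retrieved alignment and writing it to local disk could look like, the example below builds a small Bio::SimpleAlign object by hand, so that it runs without network access, and saves it with Bio::AlignIO. The sequences, ids and the file name example.aln are made up for illustration. The loop uses each_seq, the method name in the released Bio::SimpleAlign.

```perl
use strict;
use warnings;
use Bio::LocatableSeq;
use Bio::SimpleAlign;
use Bio::AlignIO;

# Build a tiny alignment by hand. In practice $aln would be the object
# returned by get_Aln_by_acc() or get_Aln_by_id().
my $aln = Bio::SimpleAlign->new();
$aln->add_seq(Bio::LocatableSeq->new(
    -id => 'seq1', -seq => 'ACGU-ACGU', -start => 1, -end => 8));
$aln->add_seq(Bio::LocatableSeq->new(
    -id => 'seq2', -seq => 'ACGUGACGU', -start => 1, -end => 9));

# Inspect the alignment: number of sequences, number of columns,
# then each aligned sequence with its id.
printf "%d sequences, %d columns\n", $aln->num_sequences, $aln->length;
foreach my $seq ($aln->each_seq) {
    printf "%s\t%s\n", $seq->display_id, $seq->seq;
}

# Write the alignment to local disk in ClustalW format.
my $out = Bio::AlignIO->new(-file => '>example.aln', -format => 'clustalw');
$out->write_aln($aln);
$out->close;
```

Any other format supported by Bio::AlignIO, such as 'fasta', can be chosen by changing the -format argument.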
I have done some tests on these packages, and I will turn them into
standard tests later. Any suggestions on these packages are welcome.
Cheers,
Jun Yin
Ph.D. student in U.C.D.
Bioinformatics Laboratory
Conway Institute
University College Dublin