[Bioperl-l] Rfam/Pfam annotations and SimpleAlign
Chris Fields
cjfields at uiuc.edu
Fri Oct 27 18:34:57 UTC 2006
I am working on refactoring the AlignIO::stockholm parser so that it can read
and write Pfam/Rfam alignments, and noticed that many alignments have
EMBL-like annotations attached that pertain to the entire alignment:
# STOCKHOLM 1.0
#=GF ID ykkC-yxkD
#=GF AC RF00442
#=GF DE ykkC-yxkD element
#=GF AU Moxon SJ
#=GF GA 20.0
#=GF NC 0.1
#=GF TC 59.4
#=GF SE Barrick JE, Breaker RR
#=GF SS Predicted; Barrick JE, Breaker RR
#=GF TP Cis-reg; riboswitch;
#=GF BM cmbuild CM SEED
#=GF BM cmsearch -W 175 CM SEQDB
#=GF RN [1]
#=GF RM 15096624
#=GF RT New RNA motifs suggest an expanded scope for riboswitches in
#=GF RT bacterial genetic control.
#=GF RA Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, Collins J, Lee
#=GF RA M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR;
#=GF RL Proc Natl Acad Sci U S A 2004;101:6421-6426.
#=GF CC This family represents the bacterial ykkC/yxkD element. The function of
#=GF CC this family is unclear although it has been suggested that it may function
#=GF CC to switch on efflux pumps and detoxification systems in response to harmful
#=GF CC environmental molecules [1]. The Thermoanaerobacter tengcongensis sequence
#=GF CC EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that the two
#=GF CC riboswitches may work in conjunction to regulate the the upstream gene
#=GF CC which codes for Swiss:Q8RC62, a member of Pfam:PF00860 (Personal obs. Moxon
#=GF CC SJ).
#=GF SQ 16
As currently implemented, Bio::SimpleAlign doesn't seem to have a way to store
this information.
I'll work on getting the core alignment IO working, but would there be any
interest in having a way to store annotations in Bio::SimpleAlign? I'm
guessing the methods would be similar to the various Bio::Seq Annotation
methods.
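
To make the idea a bit more concrete, here is a rough sketch of what
"alignment-level annotation" could look like as a data structure. It is written
in Python purely for illustration and is not BioPerl code: the function name
parse_gf_annotations and the tag-to-list-of-values layout are assumptions, and
the sample lines are taken from the alignment above.

```python
from collections import defaultdict


def parse_gf_annotations(lines):
    """Collect per-file (#=GF) Stockholm annotations, keyed by tag.

    Hypothetical helper for illustration only. Repeated tags (BM, RT, CC, ...)
    keep every value, in the order they appear in the file.
    """
    annotations = defaultdict(list)
    for line in lines:
        if not line.startswith("#=GF "):
            continue  # skip the header, sequence rows, and other markup
        parts = line.split(None, 2)  # "#=GF", tag, rest of line
        if len(parts) < 3:
            continue  # tag with no value; nothing to store
        _, tag, value = parts
        annotations[tag].append(value.strip())
    return dict(annotations)


sample = [
    "# STOCKHOLM 1.0",
    "#=GF ID ykkC-yxkD",
    "#=GF AC RF00442",
    "#=GF DE ykkC-yxkD element",
    "#=GF BM cmbuild CM SEED",
    "#=GF BM cmsearch -W 175 CM SEQDB",
    "#=GF RT New RNA motifs suggest an expanded scope for riboswitches in",
    "#=GF RT bacterial genetic control.",
    "#=GF SQ 16",
]

annotations = parse_gf_annotations(sample)

# Multi-line values such as RT or CC can be rejoined when they are read back.
title = " ".join(annotations["RT"])
print(annotations["AC"], annotations["BM"], title)
```

Keeping a list per tag preserves repeated lines such as BM and CC so they could
be written back out in the same order. Grouping the reference fields
(RN/RM/RT/RA/RL) into one record per reference would need extra structure that
this sketch does not attempt.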
Chris
Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign