[Bioperl-l] Stockholm to fasta

shalabh sharma shalabh.sharma7 at gmail.com
Tue Sep 22 20:17:11 UTC 2009


Hi Chris,

Thanks a lot, it was really helpful.

Thanks
Shalabh


On Tue, Sep 22, 2009 at 1:13 PM, Chris Fields <cjfields at illinois.edu> wrote:

> The POD for Bio::AlignIO::stockholm indicates where the various bits of
> information are stored.  Everything from the header should be in there in
> the latest bioperl; in many cases it's not ideally stored, but it's
> accessible.
>
> You'll need to preprocess your seqs in the SimpleAlign returned (iterate
> through them and change the relevant bits like desc(), displayname(),
> seq_id, etc) and may need to do other modifications, but it should work.
>
> chris
>
>
> On Sep 22, 2009, at 11:48 AM, shalabh sharma wrote:
>
>> Hi All,
>>
>> I am trying to convert Stockholm to FASTA format. I am using "sreformat"
>> for this purpose. I am getting a FASTA file, but the problem is that I
>> want the header information from the Stockholm file in my FASTA file.
>> For example:
>> # STOCKHOLM 1.0
>>
>> #=GF AC   RF00003
>> #=GF ID   U1
>> #=GF DE   U1 spliceosomal RNA
>> - - - - - - - - - -  - - - -
>> - - - - - - - - - - - -- -
>> - - - - - - -- - - - - -
>> #=GF RL   J Biol Chem 2001;276:21476-21481.
>> #=GF CC   U1 is a small nuclear RNA (snRNA) component of the spliceosome
>> #=GF CC   (involved in pre-mRNA splicing). Its 5' end forms complementary
>> #=GF CC   base pairs with the 5' splice junction, thus defining the 5'
>> #=GF CC   donor site of an intron.
>> #=GF CC   There are significant differences in sequence and secondary
>> #=GF CC   structure between metazoan and yeast U1 snRNAs, the latter being
>> #=GF CC   much longer (568 nucleotides as compared to 164 nucleotides in
>> #=GF CC   human). Nevertheless, secondary structure predictions suggest
>> #=GF CC   that all U1 snRNAs share a 'common core' consisting of helices I,
>> #=GF CC   II, the proximal region of III, and IV [1].
>> #=GF CC   This family does not contain the larger yeast sequences.
>> #=GF SQ   100
>>
>>
>> X63783.1/2024-2186  UUACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
>> X63783.1/1394-1556  UUACUUACCUGGCUGG.AGUUA.GCUA...UCGAUCAU.GAAG.GGUAG.
>> X58845.1/1-161      ..ACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
>> X63783.1/596-756    UAAAUUACAAUGUUGU.AGUUA.GCUA...UAUAUCAA.AAAA.UAUAG.
>> M29062.1/238-387    UUACUUACCUGGCAUG.AGUUU..CUG...CAGCACAA.GAAU.UGUGG.
>>
>> As output I am just getting a FASTA file with headers like
>> "X63783.1/2024-2186", but I want the headers to include some
>> information such as "U1" or "U1 spliceosomal RNA" from the Stockholm
>> headers.
>>
>> I would really appreciate it if anyone could help me out.
>>
>> Thanks
>> Shalabh
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
>
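
The preprocessing described in the thread (read the #=GF header, then
rewrite each sequence's name/description before writing FASTA) can be
sketched without BioPerl. What follows is a minimal stand-alone Python
sketch, not the Bio::AlignIO::stockholm API. The function names
parse_stockholm and to_fasta are hypothetical, and the sketch assumes
that each sequence line holds a name and its residues on one line, that
only the #=GF ID and DE fields are wanted in the header, and that "." and
"-" are the only gap characters.

```python
import io


def parse_stockholm(handle):
    """Read the first alignment from a Stockholm file.

    Returns (gf, seqs): gf maps each #=GF tag to a list of its values,
    seqs maps each sequence name to its (possibly gapped) residues.
    """
    gf = {}
    seqs = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        if line.startswith("//"):
            break  # end of the first alignment
        if line.startswith("#=GF"):
            parts = line.split(None, 2)  # "#=GF", tag, value
            if len(parts) == 3:
                gf.setdefault(parts[1], []).append(parts[2])
            continue
        if line.startswith("#"):
            continue  # format line and other markup (#=GS, #=GR, #=GC)
        parts = line.split()
        if len(parts) != 2:
            continue  # not a "name residues" line; skipped in this sketch
        name, residues = parts
        # Interleaved files repeat names across blocks, so concatenate.
        seqs[name] = seqs.get(name, "") + residues
    return gf, seqs


def to_fasta(gf, seqs, strip_gaps=True):
    """Format seqs as FASTA, appending the family ID and DE to each header."""
    extra = " ".join(
        part
        for part in (" ".join(gf.get("ID", [])), " ".join(gf.get("DE", [])))
        if part
    )
    records = []
    for name, residues in seqs.items():
        if strip_gaps:
            residues = residues.replace(".", "").replace("-", "")
        header = f">{name} {extra}" if extra else f">{name}"
        records.append(f"{header}\n{residues}\n")
    return "".join(records)


# Example input built from lines quoted in this thread.
example = """# STOCKHOLM 1.0
#=GF AC   RF00003
#=GF ID   U1
#=GF DE   U1 spliceosomal RNA
X58845.1/1-161  ..ACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
//
"""

gf, seqs = parse_stockholm(io.StringIO(example))
print(to_fasta(gf, seqs), end="")
```

With this input the header line becomes ">X58845.1/1-161 U1 U1
spliceosomal RNA". Pass strip_gaps=False to keep the alignment columns
in the output.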


