[Bioperl-l] creating a simplealign object from strings

Nathan Haigh nathanhaigh at ukonline.co.uk
Wed Oct 20 11:23:42 EDT 2004


There must be some non-standard characters in the sequence that you are trying to set! If you look at the
Bio::PrimarySeq::validate_seq method, you will notice that the only allowed characters are letters (A-Z) and '-', '.', '*' and '?'.

You should do some error checking on your sequence before you try to add it to your alignment object. A simple substitution regexp
can catch any characters that are not allowed and convert them to missing data or gaps, which keeps your alignment columns intact.
If you do this you will notice that there is a number 7 in the sequence you showed (in the stretch ...LMRTQV7TCERR...).
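
As a rough illustration of that check, here is a minimal sketch in plain Perl. The sub name clean_aligned_seq, the %alignment hash and the seq1/seq2 entries are made up for the example (seq1 reuses a fragment of the sequence from the error message, seq2 is invented). The character class is my reading of the allowed set described above, so please compare it with validate_seq in your own BioPerl version.

```perl
use strict;
use warnings;

# Replace every character outside the allowed set (letters, '-', '.', '*', '?')
# and report where each one was found. Hypothetical helper, not part of BioPerl.
sub clean_aligned_seq {
    my ($seq, $replacement) = @_;
    # '?' marks missing data; pass '-' instead to insert a gap
    $replacement = '?' unless defined $replacement;

    my @bad;
    # pos() after a one-character match is that character's 1-based column
    while ($seq =~ /([^A-Za-z\-.*?])/g) {
        push @bad, { char => $1, column => pos($seq) };
    }

    # One-for-one replacement, so the aligned length does not change
    (my $clean = $seq) =~ s/[^A-Za-z\-.*?]/$replacement/g;
    return ($clean, \@bad);
}

# Illustrative data only: seqname => aligned sequence, as in the original script
my %alignment = (
    seq1 => 'NLVLLADTLMRTQV7TCERRNNIG',
    seq2 => 'NLVLL-DTLMRTQV-TCERR--IG',
);

for my $name (sort keys %alignment) {
    my ($clean, $bad) = clean_aligned_seq($alignment{$name});
    warn "$name: replaced '$_->{char}' at column $_->{column}\n" for @$bad;
    $alignment{$name} = $clean;
}
```

The cleaned strings can then be passed to LocatableSeq and add_seq as before. Because each bad character is replaced by exactly one character, the sequence keeps its length and stays in register with the rest of the alignment.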

Hope this helps

Nathan

> -----Original Message-----
> From: bioperl-l-bounces at portal.open-bio.org [mailto:bioperl-l-bounces at portal.open-bio.org] On Behalf Of luisa pugliese
> Sent: 20 October 2004 15:39
> To: Bioperl-l at portal.open-bio.org
> Subject: [Bioperl-l] creating a simplealign object from strings
> 
> Dear all,
>     I am working with a huge protein alignment that comes from a web page
> and I would like to import it into bioperl. In order to do this I created a
> hash with seqnames as keys and sequences, including gaps, as values. Then I
> used LocatableSeq in order to create sequence objects that can be then
> imported into a simplealign object with the function add_seq.
> Using LocatableSeq I got the following error:
> ----------------------------
> MSG: Attempting to set the sequence to
> [----------DPQMWDFDDLNFTG-MP-PADE-DYSPCMLET-ETLNKYVV---II---AYA--LVF--L--LSL
> LGNSL-----VM----L----VILY-----SRV-----GR-----SV-TD-----V--YL-LN-LALADLLFAL--
> ---TLPIWAASKV-----NG------WIF--GTF-----LC-KVVSL----L----K----E-----VN---FY--
> -SG-I-----LL-----LACISV--D-R--YL-----AIVH---ATR---T-----L----TQ---K---RH----
> L-----VK-FV-----CLG-----CWGLSMNLSL-----PFFLFRQ----AY-----HPNNSSP--VCYEVLG---
> --NDTAKWRMVL-----RI--LPHT--FGFIVPLFVMLFCYGF-TLRTLFKAHM-----------------GQ---
> -KHR-----A---M-RVIFAVV-LIFLLC---WLP----YNLVLLADTLMRTQV7TCERRNNIGRALDATEILGFL
> ---H---SC-----L--NP----IIYA-----FIGQ---N-----FRH----GF-----L-----K--ILAMHGLV
> SKEFLA--RHRVTS-----------------------------] which does not look healthy
> STACK Bio::PrimarySeq::seq
> /home/luisa/bioperl_new/bioperl-1.4/Bio/PrimarySeq.pm:268
> STACK Bio::PrimarySeq::new
> /home/luisa/bioperl_new/bioperl-1.4/Bio/PrimarySeq.pm:217
> STACK Bio::LocatableSeq::new
> /home/luisa/bioperl_new/bioperl-1.4/Bio/LocatableSeq.pm:100
> STACK toplevel /home/luisa/perl/string2seq.pl:33
> -------------------------------
> This is not the first sequence in the file and on a smaller set the script
> worked.
> Does anybody know what I should do in order to avoid this problem?
> Thanking you all,
> best regards
> Luisa
> 
> =============================
> Luisa Pugliese, Ph.D.
> luisa.pugliese at safan-bioinformatics.it
> S.A.F.AN. BIOINFORMATICS
> Corso Tazzoli 215/13 -10137 Torino - ITALY
> tel +39 011 3026230
> fax +39 011 3165080
> cell. +39 333 6130644