[Bioperl-l] Fwd: spliced

Tristan Lefebure tristan.lefebure at gmail.com
Mon Jul 15 13:45:21 UTC 2013


Forwarding to the list (sorry):

---------- Forwarded message ----------
From: Tristan Lefebure <tristan.lefebure at gmail.com>
Date: Mon, Jul 15, 2013 at 3:06 PM
Subject: Re: [Bioperl-l] spliced
To: "Fields, Christopher J" <cjfields at illinois.edu>


Hi Chris,
These are Ensembl files; my understanding is that for this species they use
FlyBase as a starting point.
The file can be found here (sorry, it is quite heavy):
ftp://ftp.ensemblgenomes.org/pub/release-19/metazoa/embl/drosophila_melanogaster/Drosophila_melanogaster.BDGP5.19.dat.gz

I selected another CDS (FBtr0310395) that also shows nonsense STOPs and has a
simpler structure. Here are chunks of the Ensembl file related to this CDS:


----------------------
ID   X    standard; DNA; HTG; 22422827 BP.
XX
AC   chromosome:BDGP5:X:1:22422827:1
XX
SV   X.BDGP5
XX
DT   18-JUN-2013

[...]

FT   gene            15673179..15674095
FT                   /gene=FBgn0263748
FT                   /locus_tag="CG43673"
FT   mRNA            join(15673179..15673224,15673413..15674095)
FT                   /gene="FBgn0263748"
FT                   /note="transcript_id=FBtr0310395"
FT   CDS             join(15673215..15673224,15673413..15674095)
FT                   /gene="FBgn0263748"
FT                   /protein_id="FBpp0302543"
FT                   /note="transcript_id=FBtr0310395"
FT                   /db_xref="flybase_transcript_id:FBtr0310395"
FT                   /db_xref="FlyBaseCGID_transcript:CG43673-RA"
FT                   /db_xref="FlyBaseCGID_translation:CG43673-PA"
FT                   /db_xref="FlyBaseName_transcript:FBtr0310395"
FT                   /db_xref="FlyBaseName_translation:FBpp0302543"
FT                   /db_xref="GO:0005576"
FT                   /db_xref="GO:0006030"
FT                   /db_xref="GO:0008061"
FT                   /db_xref="flybase_translation_id:FBpp0302543"

FT                   /db_xref="goslim_goa:GO:0003674"
FT                   /db_xref="goslim_goa:GO:0005575"
FT                   /db_xref="goslim_goa:GO:0005576"

FT                   /db_xref="goslim_goa:GO:0008150"
FT                   /db_xref="UniParc:UPI0002945AD1"
FT                   /translation="MKVWIAQHLFVVILVSSAVPLTEALGSTVCADRFNGLSFADPASC
FT                   SSFFVCQRGNAVRRECSNGLYYDPKIQTCNLPGLVKCFNGDRGGSVLGDVKANVTLVPN
FT                   GKANGEVTTTPPQTTTCPPTTTVTPAVTTKKSKLILDTEDADDAHSIFQVTPHPLTNRI
FT                   DVLRSQRDCRGINDGEYLTDPKHCRRFYMCHKNRVKRHNCPRNQWFDRETKSCQDRELV
FT                   LNCPVNRN"

[...]


FT   exon            15673413..15674095
FT                   /note="exon_id=FBgn0263748:2"
FT   exon            15673179..15673224
FT                   /note="exon_id=FBgn0263748:1"

[...]

XX
SQ   Sequence 22422827 BP; 6409325 A; 4742952 C; 4748415 G; 6432035 T; 90100
SQ   other;
     CAACATTAGC GCCATGCCCA CTGTGGGGAA TTTACCAGCA GCCCGCACAC TTAGCCGGCC        60

[...]

--------------------------


I encountered two problems using this file with BioPerl:

*1- While converting the file to FASTA format, I got the following:*

>X Drosophila melanogaster chromosome X BDGP5 full sequence 1..22422827 annotated by Ensembl Genomes
SQOTHERCAACATTAGCGCCATGCCCACTGTGGGGAATTTACCAGCAGCCCGCACACTTA
GCCGGCCTGCTGCAAAGCGGGATTTATTTAATTCATCCTCCAAGAGCCCAAACGAGCATC
CTATGAGTTTCTCGGAAGTGGTAGCTGGAGCAGGTCCAGTTTCTATGGCACCCCCTAATC
[...]

See the "SQOTHER"? It looks like the second line of the SQ header
("SQ   other;") is read as sequence...
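
For comparison, here is a minimal sketch of the behaviour one would expect from a reader: every line starting with "SQ" is treated as header, and only the lines after it contribute residues. It is plain Python (standard library only), not BioPerl and not the actual EMBL parser code. The function name `read_embl_sequence` is made up for illustration, and the sketch assumes the sequence block ends at a `//` line, which is added to the toy input below.

```python
def read_embl_sequence(lines):
    """Return the sequence from an EMBL-style SQ block as one uppercase string.

    Assumption: any line starting with "SQ" is header (including a wrapped
    continuation such as "SQ   other;"), and "//" terminates the record.
    """
    in_block = False
    chunks = []
    for line in lines:
        if line.startswith("//"):
            break
        if line.startswith("SQ"):
            # Header line (first or continuation): never part of the sequence.
            in_block = True
            continue
        if in_block:
            # Keep letters only; this drops spaces and the trailing position count.
            chunks.append("".join(ch for ch in line if ch.isalpha()))
    return "".join(chunks).upper()


# Toy input built from the lines quoted above, plus a "//" terminator.
example = [
    "XX",
    "SQ   Sequence 22422827 BP; 6409325 A; 4742952 C; 4748415 G; 6432035 T; 90100",
    "SQ   other;",
    "     CAACATTAGC GCCATGCCCA CTGTGGGGAA TTTACCAGCA GCCCGCACAC TTAGCCGGCC        60",
    "//",
]
sequence = read_embl_sequence(example)
print(sequence[:20])  # CAACATTAGCGCCATGCCCA
```

With this handling the sequence starts at CAACATTAGC and contains no "SQOTHER" prefix.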


*2- If you extract the CDS with the attached script, you don't get the
expected sequence for this locus (nor for many others, though not all):*

>FBtr0310395|FBgn0263748|X
YQHVFRDCTASICGDPSVVGGAID*SFGQHRLCGSLQWPVVCGSGQLLQLLCVPAW*CRS
ARVLQWPVLRSKDPDLQSTRTSQMFQRGSRRFCAGRRQSKRHFGAQWKGQWGGHHDATTN
NHLSTDDDSDTRSDHQKIETYSRYRGCR*CPFHLPSYSASTH*QN*CAEIPARLPWNKRW
RVFDRSQTLPSFLYVP*ESGQAP*LPTESVVRSGDEILPRSRVGTELPSQS

It's totally off. I realigned the exported sequence to the genome, and what
BioPerl has actually extracted is:
   join(15673208..15673216,15673405..15674088)

it was supposed to be:
  join(15673215..15673224,15673413..15674095)
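
One thing that may be worth checking, as a guess rather than a diagnosis: "SQOTHER" is 7 characters long, and the outer boundaries of the recovered location are exactly 7 bases upstream of the annotated ones (15673215 - 15673208 = 7 and 15674095 - 15674088 = 7). If those 7 stray characters sit at the front of the stored sequence, every annotated coordinate would land 7 bases too early, so the two problems could share a cause. The two inner boundaries differ by 8 rather than 7, which this guess does not explain by itself; it might be an ambiguity in the realignment at the exon junction, but that is an assumption. The sketch below is plain Python on a toy sequence (no BioPerl involved; `splice` is a made-up helper) and only illustrates the arithmetic.

```python
def splice(seq, segments):
    """Concatenate 1-based, inclusive (start, end) segments of seq."""
    return "".join(seq[start - 1:end] for start, end in segments)


# Coordinates taken from the two join() locations quoted above.
annotated = [(15673215, 15673224), (15673413, 15674095)]
recovered = [(15673208, 15673216), (15673405, 15674088)]

# Offset between annotated and recovered coordinates, boundary by boundary.
offsets = [(a_start - r_start, a_end - r_end)
           for (a_start, a_end), (r_start, r_end) in zip(annotated, recovered)]
print(offsets)  # [(7, 8), (8, 7)]

# Toy demonstration: a clean sequence versus one with 7 stray leading characters.
clean = "ACGTTGCAAGGCTTACCGATAGGCTTAACGGATCCGTAAGCTTGCA"
contaminated = "SQOTHER" + clean
junk = len("SQOTHER")

toy_join = [(10, 15), (25, 40)]  # hypothetical two-exon location
shifted_join = [(start - junk, end - junk) for start, end in toy_join]

# Reading the contaminated sequence at the annotated coordinates gives the
# same bases as reading the clean sequence 7 positions upstream.
assert splice(contaminated, toy_join) == splice(clean, shifted_join)
```

If this is what is happening, fixing the SQ header handling should also bring the spliced CDS back into register.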


These look like real bugs; what do you think? It would be easier to work with
a smaller example...

Thanks!
--
Tristan








On Fri, Jul 12, 2013 at 7:59 PM, Fields, Christopher J <
cjfields at illinois.edu> wrote:

> According to FlyBase that particular gene is listed as incomplete.  Do you
> have the original EMBL nuc accession # so we can test this?
>
> chris
>
> On Jul 12, 2013, at 10:16 AM, Tristan Lefebure <tristan.lefebure at gmail.com>
> wrote:
>
> > Dear bioperlers,
> >
> > I am trying to extract CDS sequences from ensembl EMBL files (*.dat),
> > but I get STOP codons where I suppose I should not get some. I am
> > using the usual way of extracting coding seq, splice the seq, and then
> > translate them:
> >
> > $feat_object->spliced_seq->translate(-codontable_id => $genetcode)->seq
> >
> > An example with this CDS:
> >
> > FT   CDS             join(81211..83052,122064..122745,122800..123184,
> > FT                   135429..135872,138620..139940,140189..141784,
> > FT                   141841..145922,145985..146089,146143..147085,
> > FT                   147145..147638,147696..148243)
> > FT                   /gene="FBgn0001313"
> > FT                   /protein_id="FBpp0289299"
> > FT                   /note="transcript_id=FBtr0300022"
> > FT                   /db_xref="flybase_transcript_id:FBtr0300022"
> > FT                   /db_xref="RefSeq_peptide:NP_001015505"
> > FT                   /db_xref="RefSeq_mRNA:NM_001015505"
> > FT                   /db_xref="Uniprot/SPTREMBL:Q281X0"
> > FT                   /db_xref="Uniprot/SPTREMBL:Q5LJP0"
> > FT                   /db_xref="FlyBaseCGID_transcript:CG17866-RB"
> > FT                   /db_xref="FlyBaseCGID_translation:CG17866-PB"
> > FT                   /db_xref="FlyBaseName_transcript:FBtr0300022"
> > FT                   /db_xref="FlyBaseName_translation:FBpp0289299"
> > FT                   /db_xref="GO:0003774"
> > FT                   /db_xref="GO:0003777"
> > FT                   /db_xref="GO:0003777"
> > FT                   /db_xref="GO:0005524"
> > FT                   /db_xref="GO:0005524"
> > FT                   /db_xref="GO:0005858"
> > FT                   /db_xref="GO:0005875"
> > FT                   /db_xref="GO:0007018"
> > FT                   /db_xref="GO:0007018"
> > FT                   /db_xref="GO:0016887"
> > FT                   /db_xref="GO:0030286"
> > FT                   /db_xref="GO:0030286"
> > FT                   /db_xref="GO:0042623"
> > FT                   /db_xref="flybase_translation_id:FBpp0289299"
> > FT                   /db_xref="goslim_goa:GO:0003674"
> > FT                   /db_xref="goslim_goa:GO:0005575"
> > FT                   /db_xref="goslim_goa:GO:0005622"
> > FT                   /db_xref="goslim_goa:GO:0005623"
> > FT                   /db_xref="goslim_goa:GO:0005856"
> > FT                   /db_xref="goslim_goa:GO:0008150"
> > FT                   /db_xref="goslim_goa:GO:0016887"
> > FT                   /db_xref="goslim_goa:GO:0043226"
> > FT                   /db_xref="goslim_goa:GO:0043234"
> > FT                   /db_xref="UniParc:UPI00018FBBAC"
> > FT
> /translation="LQGLNAQLDQVDVQQIIRVLRSTHSVYIKQIDELIFESTHELMEA
> > FT
> MENIKFLHLLMQPCSQLDFSESPTFVSQLIPRTIHLIRFIWLNSEQYNRRDLITGIFRN
> > FT
> LSNQIIRFCTEKVNVEKILSGSSRFGIKICNMCIDCCLTYKGIYDIMSKTHAKINIRIG
> > FT
> WSLDNAMIFNHVDAFMERLNDVIDICESMMVFGRLDESESIPKPQFGGTSGTEFEATAD
> > FT
> NVENEFLVTLTALCTDSKEIILNVHKNEWYEEVIKYRRTVQSMEETVQRLMSNVFQHIC
> > FT
> NVEEALESLNVMIFYSYRSTIRKTFLRQVSSAWVFFSNEIDSSVHMLMDRSKMHESWVP
> > FT
> YYASRALGYRVHLDRLVWLCNRLNSSDWLPNVSEASVVLKKFESVRREFDKEVKKSFDE
> > FT
> WQKNCCSLLLNQKLDRYLLIRSKKKKGLIECNIDRTILTICEQAQHFERLGLGVPGMVR
> > FT
> KIYEKHETLRFVYNSVVQVCLNYNHILSALSEQERKLFRALIQACDRKIAPGVFKLTYG
> > FT
> GELSDAYIADCAKHTNKLQETMDIYKRAIQNIARFCEKICDTPMLKFNFSGAVTISIFE
> > FT
> NHLSSYLRRVSNILRGFYSTITDLIFAVFKEFQAVIEDMPIEWYGFVNVFDDMLATAFL
> > FT
> TSSKNSLNMLTNALHRDPDMAAAPILVMESDVRERCIVLTPDIDVIANLLSGYIDRIHN
> > FT
> ILEQFPRIGIKMKLPKEHQYESFSKAFLEDSESTQLICNIEAEINHEREEIDGYITFWN
> > FT
> SHRMLWETTELEFTKRVKATQMTADIFEASIEYYSAMADDISYVDAITHVYFILMNQNY
> > FT
> IKSSILDCIEKWQALNIKILLSHSFSLIRAIYRYMRKNERKMMMVPRTLKESLLAKQFF
> > FT
> ERIINEVPLKQAGFPPTLELFAILDKYQVEIPEEIRVKVIGLEAAWHHYLKRLGEADEM
> > FT
> LDNNREEFKKILVQQAEKFKIILKEFLDDFFLKLPTSANINPRIALKFLRIIALKIEDC
> > FT
> FTFEESLMRDLAVFNVNQPESIDLRKLDFEVRIVKNIWELIFEWQTNWEGWKKGYFWKM
> > FT
> NINEMEDTALNLYKEFTTLNKKFYDRHWEMLEATTKNVDSFRRTLPLITALKNPCMRER
> > FT
> HWNRVRDVIHVNFDENSKNFTLELIINLDFQAFSEDIQDISNPATMELQIENSIKNIAT
> > FT
> IGKNKVLKCFYHDGIYRIKNVEDCFQLLEEHMVQISAMKATRFVEPFITIVDYWEKTLS
> > FT
> YISETLEKGLTVQRQWLYLENIFQGDDIRKQLPEEAKRFATITEEFRTISSKMFQAKTA
> > FT
> VKATNLRPPPFLLNRFSRMDERLELIQRALEIYLEAKRQLFPRFYFISNDDLLEILGNS
> > FT
> KRPDLVQTHLKKLFDNLYKLELKRVGKTLSRWQASGMHSDDGEYVEFMMVIYIDGPSER
> > FT
> WLKQVEEYMLVVMKEMLKLTRGSLKKLVGNREKWISLWPGQMVLTTAQIQWTTECTRSL
> > FT
> IHCSMVDQKKPLRKLKKKQIKVLSKLSEMSRKDLTKTMRLKVNTLITLEIHGRDVIERM
> > FT
> YKSNCKDTGHFEWFSQLRFYWHRESELCVIRQTNTEHWGCFDEFNRINIEVLSVVAQQI
> > FT
> MSIMAALSTKALELMFEGQMIKLKHTVGLFITMNPGYAGRTELPDNLKSMFRPISMMVP
> > FT
> DNIIIAENLLFSDGFTNTRNLARKVYTLYELAKQQLSKQYHYDFGLRSMVALLRYAGRK
> > FT
> RRQLPNTTEEEIVYLAMKDMNVARLTANDLPLFNGIMSDIFPGVSLPTIDYSEFNIAIY
> > FT
> EEFREAGLQPITIAVKKVIELFETKNSRHSVMIIGDTGTAKSVTWRTLQNCFYRMNSQR
> > FT
> FSGWEAVTVYPVNPKALNLAELYGEYNLSTGEWLDGVLSSIMRIICGDEEPTQKWLLFD
> > FT
> GPVDAVWIENMNSVMDDNKLLTLVNSERITMPVQVSLLFEVGDLAVASPATVSRCGMVY
> > FT
> NDYNDWGWKPFVNSWLQRLRIKEFADFLRIHFDYMVPKILDFKRMRCKEPVRTNELNGV
> > FT
> VSLCKLLEIFGTKVNGINPINLELLEEMTRLWFMFCLVWSICSSVDEDSRQRLDSFIRE
> > FT
> LESCFPIKDTVFDYFVDPNERTFLPWDSKLLSSWKCDFESPFYKIIVPTGDTVRYEYVV
> > FT
> SKLLAEEYPVMLVGNVGTGKTSTAISVMEACDKNKFCILAVNMSAQTTAAGLQESIENR
> > FT
> TEKRTKTQFVPIGGKRMICFMDDFNMPAKDIYGSQPPLELIRQWIDYKYWFNRKTQQKI
> > FT
> YVQNTLLMAAMGPPGGGRQTISSRTQSRFVLLNLTFPSQETIIRIFGTMLCQKLESYPN
> > FT
> EVREMWLPITLCTINLYVSMISKMLPTPNKSHYLFNLRDISKVFQGLLRSEKELQNKKN
> > FT
> FFLRLWVHECFRVFSDRLVDDSDQFWFVNTINDILGKHFEVTFHSLCPSKVPPFFGDFA
> > FT
> HPQGFYEDLQVDFLRTFMKNQLEEYNNFPGMTRMNLVFFREAIEHIVRILRVISQPRGH
> > FT
> ILNMGIGGSGRQVLTKLAAFILEMAVFQIEVTKKYKTGDFREDLKNLYKVTGIKQRLTI
> > FT
> FIFSSDQIAEVSFLEITNNMLSTGEINLFKSDEFDELKPELERPAKKNGVLLTTEALYS
> > FT
> YFILNVRDFLHVALCFSPIGENFRSYIRQYPALLSSTTPNWFRFWPQEALLEVASHFLI
> > FT
> GFPLNVVVSGKEDEKHRESLVISTEAILQRDIAYVFSVIHSSVAKMSENMYAEVKRYNY
> > FT
> VTSPNYLQLVSGFKKLLEKKRLEVSTASNRLRNGLSKISETQEKVSLMSEELKASSEQV
> > FT
> KILARECEDFISMIEIQKSEATEQKEKVDAEAVLIRRDEIICLELAATARADLEVVMPM
> > FT
> IDAAVKALDALNKKDISEVKSYGRPPMKIEKVMEAVLILLGKEPTWENAKKVLSESTFL
> > FT
> NDLKNFDRDHISDKTLKRIAIYTKNPELEPDKVAVVSLACKSLMQWIMAIENYGKVYRI
> > FT
> VAPKQEKLDSAMKSLEEKQAALAAAKKKLEELQVVIEELYRQLEEKTNLLNELRAKEER
> > FT
> LRKQLERAIILVESLSGERERWIETVNQLDLSFEKLPGDCLLSVAFMSYLGAFDTKYRE
> > FT
> ELLVKWSLLIKDLLIPATLELKVTYFLVDAVSIREWNIQGLPADDLSTENGVIVTQGSR
> > FT
> WPLIIDPQMQANNWIKNMEERNQLMTLDFGMADYLRQLERALKEGLPVLLQNVGEYLDQ
> > FT
> AINPILRQSFTIQSGERLLKFNDKYISYNNSFRFYITTKISNPHYPPEISSKTTIVNFA
> > FT
> LKQDGLEAQLLGIIVRKEKPALEEQKDELVMTIARNKRTLIDLDNEILRLLNESRGSLL
> > FT
> DDDELFSTLQKSRQTSVLVKESLSIAEVTEVEIDAARQEYKPASERASILFFVLMDMSK
> > FT
> IDPMYVFSLAAYILLFTQSIERSPRNQLVHERIQNINEYHSYAVYRNTCRGLFERHKLL
> > FT
> FSIHMTAKILSNAGKLLEEEYDFILKGGIVLDKLGQAPNPAPWWISEQNWDNITELDKV
> > FT
> SGFHGIIDSFEQHYKAWNGWYATTFPEQEDLVGEWNDKLTDFQKICVLRSLRPDRISFC
> > FT
> LTQFIITKLGPRYVDPPVLDLKATFDESISQTPLIFVLSPGVDPAQSLISLSESVKMAQ
> > FT
> RMYSLSLGQGQAPIATKLIMDGIKDGNWVFLANCHLSLSWMPTLDKMIATMQSMKLHKK
> > FT
> FRLWLSSSPHPDFPISILQTSIKMTTEPPRGIKSNMKRLYNNINEANMENCSEPSKYKK
> > FT
> LLFALCFFHTVLLERKKFLELGWNVIYSFNDSDFEVSEILLLLYLNEYEDTPWGALKYL
> > FT
> IAGVNYGGHITDDWDRRLLITYINQFFCDQALQTRKFRLSTLPNYFIPDDGDVQSYLDQ
> > FT
> IQMFPNFDKPDAFGQHSNADIASLIGETRMLFEALLSMQVQTNSTSSNENGETKVFDLA
> > FT
> KEILMNTPDEINYEQTAKIIGINRTPLEVVLLQEIERYNKLLVDMSTQLRDLRRGIQGL
> > FT
> VVMSSDLEDIYLAVSEGRVPLQWLKAYNSLKPLAAWARDLIHRVGHFNSWAKTLRPPIL
> > FT
> FWLAAYTFPTGFVTAVLQTSARATKTPIDELSWDFYVFVEEDTAAARIIREGGGVYIRS
> > FT
> LFLEGGGWLRKNQCLQDPLPMELICPLPVIHFKPVENLKKRCRGVYQCPAYYYPVRSGS
> > FT                   FVIAVDLKSGNEKADYWIKRGTALLLSLAN"
> >
> >
> > Which gives:
> >
> >> FBtr0300022__FBgn0001313
> >
> LQGLNAQLDQVDVQQIIRVLRSTHSVYIKQIDELIFESTHELMEAMENIKFLHLLMQPCSQLDFSESPTFVSQLIPRTIHLIRFIWLNSEQYNRRDLITGIFRNLSNQIIRFCTEKVNVEKILSGSSRFGIKICNMCIDCCLTYKGIYDIMSKTHAKINIRIGWSLDNAMIFNHVDAFMERLNDVIDICESMMVFGRLDESESIPKPQFGGTSGTEFEATADNVENEFLVTLTALCTDSKEIILNVHKNEWYEEVIKYRRTVQSMEETVQRLMSNVFQHICNVEEALESLNVMIFYSYRSTIRKTFLRQVSSAWVFFSNEIDSSVHMLMDRSKMHESWVPYYASRALGYRVHLDRLVWLCNRLNSSDWLPNVSEASVVLKKFESVRREFDKEVKKSFDEWQKNCCSLLLNQKLDRYLLIRSKKKKGLIECNIDRTILTICEQAQHFERLGLGVPGMVRKIYEKHETLRFVYNSVVQVCLNYNHILSALSEQERKLFRALIQACDRKIAPGVFKLTYGGELSDAYIADCAKHTNKLQETMDIYKRAIQNIARFCEKICDTPMLKFNFSGAVTISIFENHLSSYLRRVSNILRGFYSTITDLIFAVFKEFQAVIEDMPIEWYGFVNVFDDMLATAFLTSSKNSLNMLTNALHRDPDMAAAPILVMESDVRERCIVLTPDIDVIANLLSGYIDRIHNILEQFPRIGIKMKLPKEHQYESFSKAFLEDSESTQLICNIEAEINHEREEIDGYITFWNSHRMLWETTELEFTKRVKATQMTADIFEASIEYYSAMADDISYVDAITHVYFILMNQNYIKSSILDCIEKWQALNIKILLSHSFSLIRAIYRYMRKNERKMMMVPRTLKESLLAKQFFERIINEVPLKQAGFPPTLELFAILDKYQVEIPEEIRVKVIGLEAAWHHYLKRLGEADEMLDNNREEFKKILVQQAEKFKIILKEFLDDFFLKLPTSANINPRIALKFLRIIALKIEDC!
> >
> FTFEESLMRDLAVFNVNQPESIDLRKLDFEVRIVKNIWELIFEWQTNWEGWKKGYFWKMNINEMEDTALNLYKEFTTLNKKFYDRHWEMLEATTKNVDSFRRTLPLITALKNPCMRERHWNRVRDVIHVNFDENSKNFTLELIINLDFQAFSEDIQDISNPATMELQIENSIKNIATIGKNKVLKWLLS*WYL*NKKR*GLFSAP*RTHGTNIGYESNSFC*AIYNHC*LLGKNTIVHK*DSGKGFNCSAPMALPRKYIPRRRHKKTTSRRGKTFCNNN*RVSNNIKQNVPGKDSRKSH*LTPSTVFIKPF*SNGRKTGTYSTCLRNLS*G*TTTFSKILFYF***PFRNFRKF*AAGLSSNPP*EVI**FIQA*AQARWENFKSVASFWNAFRRWRIC*VHDGYLYRWTIGALAKTSRRVHACCYERDA*TYSRIS*KTCREQRKMDFALARTNGANHSSDPMDN*VYA*PNSL*YG*SKKTPTQAKEKANKSSF*IIRNESKRPNKNNAP*SKYPHNA*NTWS*CYRKNV*IKL*GYGPF*MVFTTQILLAP*IGTMCNKADKHRALGMFR*V*SNKY*SALSRGTTNNVYNGSAFYKGVGAYVRGSNDKVKAHSWSIHYYESWICRTD*TS**FKVNV*THINDGT**YNYCGKFTFFGWFY*YKKLGPKGIYVV*AG*AATFKAISL*FWSSLNGGFASLRGSKKTSITKYY*RRNCLFGNERYECCEINS**FTPF*WYYV*HISWC*LTNYRLQ*I*YCDL*RI*GGGSPTNYHSRKKSN*AF*NKKL*ALSYDHRGYGNSQISYMENITKLFLSNE*SKIFRMGSSHRLPSKSKSIESSRALWGIQLVDW*MA*RSFKFYYANNLWR*RADSEMVVV*WTCGCSMD*KHELSNG***TSYACK*RTYNHASSSIAIV*SRRPGCCFTSNCFPMWNGL*RLQ*LGMETFCKLMVTAPKN*GVR*FFTNTF*LHGAKNTGF*T!
> > NEVQRACKDK*VKWSCVAL*IARNIWHKGKWDKSH*FRTS*GDD*IVVYVLFSMVNLFKCG*RQSPKTR*
> >
> LYTGTRKLLSNKRYCV*LFCGSQ*TNLFTMG*QAVEQLEMRFRISFLQDYCSYWRHCSL*ICCFKTSC*RISCDACWKCWYRKNVNGYKCNGGL**K*ILHFSCEHVSTDNSSRVTRINRKSD*ETYKNAICTYRWQTDDMFYGRL*YACKRHLWISATFGAYSAMDRLQVLV**KNSTKNICAKHIINGCDGTAWRGQTNNFQSNSKSVCFIKLNFSFTRNNYSHIWNDALSKTRVIPK*SS*DVATYNPLYH*LICIND**NVTDAK*ISLLI*S*RYIQSLSRTIKK*KRTSKQKKFFFTALGS*VF*SVQRPIG*RLRSVLVCKYY**YTW*TF*SYFSQSLSFKGSTIFR*LCSPSRVLRRSTGRFLKNIYEKST*GI*QLSRND*NEPSIF*RSYRTYCSNPESYFPTAWTHFKYGDRWIRPTSINQVSCVYFGNGSFPN*GYQKIQNRRLSRRPKKLIQSNWN*TETNDFYI*QRPNSRSLISRNNKQYAKYWRNKLI*IR*IRRAKA*T*TPGKKKWGSANN*STIFLFYFKCARLPACCALF*PNRRKFSKLYKTISGFVKFNNSKLV*ILATRSPFGSSFAFSNRISIKRSGFWKRGRKTSRKFGYKHRSHSSTRYCLCIFSNSLKCC*NVGKYVCRS*AL*LCNLTKLFAACKWF*KTIRKEKIRSINCFQ*ITQWAFKNF*NSGKSILNVRRA*S*L*TS*NTC*RM*RFYIHD*NSKE*SNGTKGKSGCRSRAY*KG*NNLS*ISSYSSCGLGGGNAYDRCCCKSIRCIE*ERHFRS*IIWTAANENRKGYGSCIDLTWKRTNMGKC*KSFK*INIFERPKKL**RSYFR*NS*TYCNLYKKS*VRAR*SGCCIACVQIIDAMDNGHRKLRKSLPNSRSKAGKIR*CNEVT*RKASCFSCGKKKTRRASGCH*RTLPAA*RKN*PS**IACQGRTT*KATGACHYFGRIAFWRERKVD*NG*SVGLIL*KTSR*LL!
> >
> AFCCVYVVLRGF*HQIPRRITCKMVFIN*RSFNTSNFGA*GYVFSSRCCFDSRMEYSRSTC**FKY*KRSNSYSR*SLASYY*PSNAS**LDKKYGRA*SINDTRFRYGRLLTSARTSSKRRFACIVAKRGGILRSSY*SNFAAELYHSKWRKVIKI**QVYFIQ*FVQILHNDKNIKSTLPTGNLIKNYYCKFCTKARWA*SPTTRNYCSKRKTRPRRTKRRTGNDNSSKQTDINRSR**DSTAT**KSRFLIR*R*VIFNFTKIPSDISAS*GVA*HCRGN*SRN*CGPTRIQTSIGTRIHFILCFNGYV*N*SNVCFFSGSVYIIIHTVY*AKSS*SASP*KNSKY**ISFLCGLPKYLSWAFRAT*ATIFNSYDSKDSFKRWKAFGRRV*FYSERRYSIR*TGTSAQPGTMVDK*AKLG*YNRIR*SFWISWDNRFF*ATLQGLEWLVCHDLPRTRRSRWRME**TYRFSKNLCFTFTSTG*NFFLFDTIYYYQTWASIC*SASS*SQGNF**IDFTDSPHIRIITRCRSSPISHITIRIS*NGTTNVLT*LGSRTSTYCNKAYNGWHQGW*LGIFSKLSFVS*LDAYS*QDDSHYAVHETT*KISTVAKLKPSSGLSNIYFANQY*DDN*TSSWNQIKYETSI*QHK*G*YGKL**TQQV*EVIIRFVLLSYSPTRTKKIFRTWLECYLQL*RF*F*SFRNTTIIVS**I*RHSLGSFKVSHSRSKLRRTHYRRLGSPTINNLYKPIFL*PSIAD*KV*IINPSKLFYSR*RRCAIIFRPNTNVSQF**A*CFWTTFKCRYSVINRRNKNAF*GSAFYASPD**HK***KR*DKSI*SR*RNFNEYTG*DKL*TDGKNYWNQSNSLRSCLTSRN*AL**TSR*HVHSIT*LKTWNTGTCCNEFGLRGYLSSCL*RKGAITMVKSI*FIETISGMG*RLNTSCRTF**LGENTPPSNIILACSLHVSNWICYSSTTNFSSSYQN!
> > TN**TLLGFLCFC*RRYCRSSYNKGRRRRLHSKFVFGGWRMVEEKPMPSGSTTDGTNLSITSNTL*ASRK
> > PKKTMSWCLPVSRILLSR*VRIICNSRGLKVW**KG*LLDKARYCTFIKFSKL
> >
> >
> > The start is good, but then it gets bad...
> > The problem seems the same as this old one:
> > http://bioperl.org/pipermail/bioperl-l/2004-August/016735.html
> >
> > I must be missing something....
> >
> > Thanks for your help!
> >
> > --
> > Tristan
> >
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: embl2cds.pl
Type: application/octet-stream
Size: 1282 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20130715/9bd6b607/attachment-0004.obj>

