[Bioperl-l] a space in Feature key
Heikki Lehvaslaiho
heikki at ebi.ac.uk
Thu Oct 23 04:41:31 EDT 2003
After checking with the EMBL databank guys, I found out that since the
feature table format is defined in terms of characters, space characters
are not specifically banned in feature keys, but they are actively avoided.
Also, I talked to Rodrigo Lopez who promised to release an updated
CpGIsle database soon (next week?) and remove the offending space.
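
Until the updated release is out, the underscore workaround Henry
describes below can be applied as a preprocessing step before parsing.
A minimal sketch in Python, assuming the only offending key is
"CpG island" and that it always directly follows the FT line code; the
function names are hypothetical and not part of BioPerl:

```python
import re

# Assumption: the two-word key "CpG island" appears only on FT lines,
# right after the line code, as in the cpgisle.dat record quoted below.
_CPG_KEY = re.compile(r"^(FT\s+)CpG island(\s)")


def fix_feature_key(line):
    """Rewrite the feature key 'CpG island' as 'CpG_island'.

    Lines that do not start with an FT 'CpG island' key are returned unchanged.
    """
    return _CPG_KEY.sub(r"\1CpG_island\2", line)


def fix_lines(lines):
    """Apply fix_feature_key to every line of a record or file."""
    return [fix_feature_key(line) for line in lines]
```

Running every line of cpgisle.dat through fix_lines and writing the
result to a new file should give input that the EMBL parser accepts,
as it did in Henry's manual test. Qualifier lines such as /size=803 are
left untouched.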
Yours,
-Heikki
On Fri, 2003-10-17 at 21:03, Henry Hyun-il Paik wrote:
> Hello Ewan,
>
> I downloaded data from
>
> ftp://ftp.ebi.ac.uk/pub/databases/cpgisle/
>
> the file name is cpgisle.dat
>
> - Henry
>
> On Fri, 17 Oct 2003, Ewan Birney wrote:
>
> >
> > On Friday, October 17, 2003, at 07:15 pm, Henry Hyun-il Paik wrote:
> >
> > >
> > > Hello list,
> > >
> > > A space is not allowed in a feature key, right?
> > >
> > > I downloaded some data from the EMBL CpGIsle database. A record looks like this:
> > >
> >
> > I don't think you are allowed spaces. Where did you get this from?
> >
> >
> >
> > > ---------------------------------------------------------------------------
> > > ID GAPDHG
> > > AC J04038;
> > > LE 5378
> > > DE Human glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene, complete cds.
> > > DE 7/95
> > > EX Gene expression widespread
> > > FT CpG island 871..1673
> > > FT /size=803
> > > FT /%(C+G)=69.12
> > > FT /Obs/Exp CpG=0.82
> > > FT CpG island 1683..2063
> > > FT /size=381
> > > FT /%(C+G)=67.19
> > > FT /Obs/Exp CpG=0.77
> > > XX
> > > FT /CAAT-box.1="884"
> > > FT /CAAT-box.2_complement="2156"
> > > FT /GC-box="1064"
> > > FT /E2F_CS.1="1785"
> > > FT /SpI="158,1198,1244,1290,1310,1314"
> > > FT /SpI_complement="174,584,1519,1668,1736,2271"
> > > FT /SpI_complement="2625"
> > > FT /AccII="717,727,1093,1268,1334,1423"
> > > FT /AccII="1489,1531,1788,2006,3650,4278"
> > > //
> > > ------------------------------------------------------------------------
> > >
> > > I tried to parse this by using SeqIO. It didn't work.
> > >
> > > I got an error message like below.
> > >
> > >
> > > -----------------------------------------------------------------------
> > >
> > > Argument "island" isn't numeric in numeric gt (>) at /home/hy1001/bin/Bio/Location/Atomic.pm line 91, <GEN0> line 15.
> > > Argument "island" isn't numeric in numeric gt (>) at /home/hy1001/bin/Bio/Location/Atomic.pm line 91, <GEN0> line 15.
> > >
> > > ------------- EXCEPTION -------------
> > > MSG: Got a sequence with no letters in - cannot guess alphabet []
> > > STACK Bio::PrimarySeq::_guess_alphabet /home/hy1001/bin/Bio/PrimarySeq.pm:817
> > > STACK Bio::PrimarySeq::seq /home/hy1001/bin/Bio/PrimarySeq.pm:276
> > > STACK Bio::PrimarySeq::new /home/hy1001/bin/Bio/PrimarySeq.pm:214
> > > STACK Bio::Seq::new /home/hy1001/bin/Bio/Seq.pm:498
> > > STACK Bio::Seq::RichSeq::new /home/hy1001/bin/Bio/Seq/RichSeq.pm:115
> > > STACK Bio::Seq::SeqFactory::create /home/hy1001/bin/Bio/Seq/SeqFactory.pm:126
> > > STACK Bio::SeqIO::embl::next_seq /home/hy1001/bin/Bio/SeqIO/embl.pm:344
> > > STACK toplevel extracted.pl:13
> > >
> > >
> > > -----------------------------------------------------------------------
> > >
> > > So I changed 'CpG island' to 'CpG_island'. Then it worked fine.
> > >
> > > I am using Perl 5.8.0 and BioPerl 1.2.3 on Linux.
> > >
> > > Thank you.
> > >
> > > - Henry.
> > >
> > >
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at portal.open-bio.org
> > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
--
______ _/ _/_____________________________________________________
_/ _/ http://www.ebi.ac.uk/mutations/
_/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk
_/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
_/ _/ _/ Wellcome Trust Genome Campus, Hinxton
_/ _/ _/ Cambs. CB10 1SD, United Kingdom
_/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________