[Bioperl-l] One more load_seqdatabase.pl question
Hilmar Lapp
hlapp at gmx.net
Wed Nov 29 04:54:53 UTC 2006
These are the protein translations stored in the feature table as
tags of features, right?
You can change the type of the column (although there may be some
issues when you update the column because the NVL() clause won't work
if I recall that correctly), but doing so will deprive you of any
'normal' searches against that column. (You can still use functions
from the DBMS_LOB package, but they will be much slower and are
completely non-standard.)
It is up to you whether that is too big of a price to pay for having
some redundant protein translations (translating the feature's DNA
sequence should give you the same) in the database. I always trimmed
those feature tags off (using a custom SeqProcessor). An alternative
is to convert these feature tags into actual bioentries (i.e.,
Bio::Seq objects; again, a custom SeqProcessor will allow you to do
that).
-hilmar
On Nov 28, 2006, at 4:13 PM, gang wu wrote:
> Hi everyone,
>
> I'm using load_seqdatabase.pl to upload some Genbank genome
> sequences to
> my Oracle BioSQL database. I saw some errors(See attached warning
> message) related to seqfeature_qualifier_value
> (SG_SEQFEATURE_QUALIFIER_ASSOC.VALUE column), which has Varchar2 data
> type of maximum 4000 bytes. Did anybody mention this issue before?
> Should I just modify the column to a type being able store more data
> such as LONG or CLOB?
>
> Thanks.
>
> Gang
>
>
> Log information:
> ============================================
>
> load_seqdatabase.pl -host elegans -driver Oracle -dbname sparc -dbuser
> biosqldb-sgowner -dbpass PASS -format genbank -namespace genbank
> /genomeseq/arabidopsis//NC_003070.gbk
>
>
> Loading /genomeseq/arabidopsis//NC_003070.gbk ...
>
>
> -------------------- WARNING ---------------------
> MSG: SimpleValueAdaptor::add_assoc: unexpected failure of statement
> execution: ORA-01461: can bind a LONG value only for insert into a
> LONG
> column (DBD ERROR: error possibly near <*> indicator at char 12 in
> 'INSERT INTO <*>seqfeature_qualifier_value (fea_oid, trm_oid, value,
> rank) VALUES (:p1, :p2, :p3, :p4)')
> name: INSERT ASSOC [2]
> Bio::SeqFeature::Generic;Bio::Annotation::SimpleValue
> values: FK[Bio::SeqFeature::Generic]:14898,
> FK[Bio::Annotation::SimpleValue]:800,
> value:"MVAVTGEVLHLLRRYLGEYVHGLSTEALRISVWKGDVVLKDLKLKAEALNSLKLPVAVKSGFV
> GTITLKVPWKSLGKEPVIVLIDRVFVLAYPAPDDRTLKFFTLVGTEFAYTNYIPGGRQGKASRNQASADR
> GTSYFWLMELHGYEAETATLEARAKSKLGSPPQGNSWLGSIIATIIGNLKVSISNVHIRYEDSTRDSSEI
> LASFFSYFNNICSSNPGHPFAAGITLAKLAAVTMDEEGNETFDTSGALDKLRKSLQLERLALYHDSNSFP
> WEIEKQWDNITPEEWIEMFEDGIKEQTEHKIKSKWALNRHYLLSPINGSLKYHRLGNQERNNPEIPFERA
> SVILNDVNVTITEEQYHDWIKLVEVVSRYKTYIEISHLRPMVPVSEAPRLWWRFAAQASLQQKRLWYTRY
> IQLYANFLQQSSDVNYPEMREIEKDLDSKVILLWRLLAHAKVESVKSKEAAEQRKLKKGGWFSFNWRTEA
> EDDPEVDSVAGGSKLMEERLTKDEWKAINKLLSHQPDEEMNLYSGKDMQNMTHFLVTVSIGQGAARIVDI
> NQTEVLCGRFEQLDVTTKFRHRSTQCDVSLRFYGLSAPEGSLAQSVSSERKTNALMASFVNAPIGENIDW
> RLSATISPCHATIWTESYDRVLEFVKRSNAVSPTVALETAAVLQMKLEEVTRRAQEQLQIVLEEQSRFAL
> DIDIDAPKVRIPLRASGSSKCSSHFLLDFGNFTLTTMDTRSEEQRQNLYSRFCISGRDIAAFFTDCGSDN
> QGCSLVMEDFTNQPILSPILEKADNVYSLIDRCGMAVIVDQIKVPHPSYPSTRISIQVPNIGVHFSPTRY
> MRIMQLFDILYGAMKTYSQAPVDHMPDGIQPWSPTDLASDARILVWKGIGNSVATWQSCRLVLSGLYLYT
> FESEKSLDYQRYLCMAGRQVFEVPPANIGGSPYCLAVGVRGTDLKKALESSSTWIIEFQGEEKAAWLRGL
> VQATYQASA!
>
> PLSGDVLGQTSDGDGDFHEPQTRNMKAADLVITGALVETKLYLYGKIKNECDEQVEEVLLLKVLASGGKV
> HLISSESGLTVRTKLHSLKIKDELQQQQSGSAQYLAYSVLKNEDIQESLGTCDSFDKEMPVGHADDEDAY
> TDALPEFLSPTEPGTPDMDMIQCSMMMDSDEHVGLEDTEGGFHEKDTSQGKSLCDEVFYEVQGGEFSDFV
> SVVFLTRSSSSHDYNGIDTQMSIRMSKLEFFCSRPTVVALIGFGFDLSTASYIENDKDANTLVPEKSDSE
> KETNDESGRIEGLLGYGKDRVVFYLNMNVDNVTVFLNKEDGSQLAMFVQERFVLDIKVHPSSLSVEGTLG
> NFKLCDKSLDSGNCWSWLCDIRDPGVESLIKFKFSSYSAGDDDYEGYDYSLSGKLSAVRIVFLYRFVQEV
> TAYFMGLATPHSEEVIKLVDKVGGFEWLIQKDEMDGATAVKLDLSLDTPIIVVPRDSLSKDYIQLDLGQL
> EVSNEISWHGCPEKDATAVRVDVLHAKILGLNMSVGINGSIGKPMIREGQGLDIFVRRSLRDVFKKVPTL
> SVEVKIDFLHAVMSDKEYDIIVSCTSMNLFEEPKLPPDFRGSSSGPKAKMRLLADKVNLNSQMIMSRTVT
> ILAVDINYALLELRNSVNEESSLAHVAVRASEPNSSISWMTSLSETDLYVSVPKVSVLDIRPNTKPEMRL
> MLGSSVDASKQASSESLPFSLNKGSFKRANSRAVLDFDAPCSTMLLMDYRWRASSQSCVLRVQQPRILAV
> PDFLLAVGEFFVPALRAITGRDETLDPTNDPITRSRGIVLSEPLYKQTEDVVHLSPRRQLVADSLGIDEY
> TYDGCGKVISLSEQGEKDLNVGRLEPIIIVGHGKKLRFVNVKIKNGSLLSKCIYLSNDSSCLFSPEDGVD
> ISMLENASSNPENVLSNAHKSSDVSDTCQYDSKSGQSFTFEAQVVSPEFTFFDGTKSSLDDSSAVEKLLR
> VKLDFNFM!
>
> YASKEKDIWVRALLKNLVVETGSGLIILDPVDISGGYTSVKEKTNMSLTSTDIYMHLSLSALSLLLNLQS
> QVTGALQSGNAIPLASCTNFDRIWVSPKENGPRNNLTIWRPQAPSNYVILGDCVTSRAIPPTQAVMAVSN
> TYGRVRKPIGFNRIGLFSVIQGLEGDNVQHSHNSNECSLWMPVAPVGYTAMGCVANIGSEQPPDHIVYCL
> SIWRADNVLGAFYAHTSTAAPSKKYSPGLSHCLLWNPLQSKTSSSSDPSSTSGSRSEQSSDQTGNSSGWD
> ILRSISKATSYHVSTPNFERIWWDKGGDLRRPVSIWRPVPRPGFAILGDSITEGLEPPALGILFKADDSE
> IAAKPVQFNKVAHIVGKGFDEVFCWFPVAPPGYVSLGCVLSKFDEAPHVDSFCCPRIDLVNQANIYEASV
> TRSSSSKSSQLWSIWKVDNQACTFLARSDLKRPPSRMAFAVGESVKPKTQENVNAEIKLRCFSLTLLDGL
> HGMMTPLFDTTVTNIKLATHGRPEAMNAVLISSIAASTFNPQLEAWEPLLEPFDGIFKLETYDTALNQSS
> KPGKRLRIAATNILNINVSAANLETLGDAVVSWRRQLELEERAAKMKEESAASRESGDLSAFSALDEDDF
> QTIVVENKLGRDIYLKKLEENSDVVVKLCHDENTSVWVPPPRFSNRLNVADSSREARNYMTVQILEAKGL
> HIIDDGNSHSFFCTLRLVVDSQGAEPQKLFPQSARTKCVKPSTTIVNDLMECTSKWNELFIFEIPRKGVA
> RLEVEVTNLAAKAGKGEVVGSLSFPVGHGESTLRKVASVRMLHQSSDAENISSYTLQRKNAEDKHDNGCL
> LISTSYFEKTTIPNTLRNMESKDFVDGDTGFWIGVRPDDSWHSIRSLLPLCIAPKSLQNDFIAMEVSMRN
> GRKHATFRCLATVVNDSDVNLEISISSDQNVSSGVSNHNAVIASRSSYVLPWGCLSKDNEQCLHIRPKVE
> NSHHSYAWGYCIAVSSGCGKDQPFVDQGLLTRQNTIKQSSRASTFFLRLNQLEKKDMLFCCQPSTGSKPL
> WLSVGADAS!
>
> VLHTDLNTPVYDWKISISSPLKLENRLPCPVKFTVWEKTKEGTYLERQHGVVSSRKSAHVYSADIQRPVY
> LTLAVHGGWALEKDPIPVLDISSNDSVSSFWFVHQQSKRRLRVSIERDVGETGAAPKTIRFFVPYWITND
> SYLPLSYRVVEIEPSENVEAGSPCLTRASKSFKKNPVFSMERRHQKKNVRVLESIEDTSPMPSMLSPQES
> AGRSGVVLFPSQKDSYVSPRIGIAVAARDSDSYSPGISLLELEKKERIDVKAFCKDASYYMLSAVLNMTS
> DRTKVIHLQPHTLFINRVGVSICLQQCDCQTEEWINPSDPPKLFGWQSSTRLELLKLRVKGYRWSTPFSV
> FSEGTMRVPVPKEDGTDQLQLRVQVRSGTKNSRYEVIFRPNSISGPYRIENRSMFLPIRYRQVEGVSESW
> QFLPPNAAASFYWENLGRRHLFELLVDGNDPSNSEKFDIDKIGDYPPRSESGPTRPIRVTILKEDKKNIV
> RISDWMPAIEPTSSISRRLPASSLSELSGNESQQSHLLASEDSEFHVIVELAELGISVIDHAPEEILYMS
> VQNLFVAYSTGLGSGLSRFKLRMQGIQVDNQLPLAPMPVLFRPQRTGDKADYILKFSVTLQSNAGLDLRV
> YPYIDFQGRENTAFLINIHEPIIWRIHEMIQQANLSRLSDPNSTAVSVDPFIQIGVLNFSEVRFRVSMAM
> SPSQRPRGVLGFWSSLMTALGNTENMPVRISERFHENISMRQSTMINNAIRNVKKDLLGQPLQLLSGVDI
> LGNASSALGHMSQGIAALSMDKKFIQSRQRQENKGVEDFGDIIREGGGALAKGLFRGVTGILTKPLEGAK
> SSGVEGFVSGFGKGIIGAAAQPVSGVLDLLSKTTEGANAMRMKIAAAITSDEQLLRRRLPRAVGADSLLR
> PYNDYRAQGQVILQLAESGSFLGQVDLFKVRGKFALTDAYESHFILPKGKVLMITHRRVILLQQPSNIMG
> QRKFIPAK!
>
> DACSIQWDILWNDLVTMELSDGKKDPPNSPPSRLILYLKAKPHDPKEQFRVVKCIPNSKQAFDVYSAIDQ
> AINLYGQNALKGMVKNKVTRPYSPISESSWAEGASQQMPASVTPSSTFGTSPTTSSS",
> rank:"1"
> --------------------------------------------------
>
>
> =============================================
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================
More information about the Bioperl-l
mailing list