[Bioperl-l] One more load_seqdatabase.pl question
gang wu
gwu at molbio.mgh.harvard.edu
Thu Nov 30 22:08:08 UTC 2006
Thanks Hilmar. Do you mean the NVL() clause will stop
load_seqdatabase.pl from working on updates?
I have a problem with updating. It seems load_seqdatabase.pl only tries to
insert instead of update. I used one of the test GenBank files that come
with bioperl-db. Please take a look at the attached output.
Thanks.
Gang
=========================================
>perl load_seqdatabase.pl -lookup -host elegans -driver Oracle -dbname
sparc -dbuser biosqldb-sgowner -dbpass PASS -format genbank -namespace
test /root/.cpan/build/bioperl-db-1.5.2-RC3/scripts/biosql/data/AP000868.gb
Loading
/root/.cpan/build/bioperl-db-1.5.2-RC3/scripts/biosql/data/AP000868.gb ...
-------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::CommentAdaptor (driver) failed, values
were ("This sequence was reannotated via the Ensembl system. Please
visit the Ensembl web site, http://www.ensembl.org/ for more
information. ","1") FKs (389109)
ORA-00001: unique constraint (BIOSQLDB_SGOWNER.XAK1COMMENT) violated
(DBD ERROR: OCIStmtExecute)
---------------------------------------------------
-------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::CommentAdaptor (driver) failed, values
were ("The /gene indicates a unique id for a gene, /cds a unique id for
a translation and a /exon a unique id for an exon. These ids are
maintained wherever possible between versions. For more information on
how to interpret the feature table, please visit
http://www.ensembl.org/Docs/embl.html. ","2") FKs (389109)
ORA-00001: unique constraint (BIOSQLDB_SGOWNER.XAK1COMMENT) violated
(DBD ERROR: OCIStmtExecute)
---------------------------------------------------
...
...
==========================================================
Hilmar Lapp wrote:
> These are the protein translations stored in the feature table as
> tags of features, right?
>
> You can change the type of the column (although there may be some
> issues when you update the column because the NVL() clause won't work
> if I recall that correctly), but doing so will deprive you of any
> 'normal' searches against that column. (You can still use functions
> from the DBMS_LOB package, but they will be much slower and are
> completely non-standard.)
>
> It is up to you whether that is too big of a price to pay for having
> some redundant protein translations (translating the feature's DNA
> sequence should give you the same) in the database. I always trimmed
> those feature tags off (using a custom SeqProcessor). An alternative
> is to convert these feature tags into actual bioentries (i.e.,
> Bio::Seq objects; again, a custom SeqProcessor will allow you to do
> that).
>
> -hilmar
>
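Hilmar's suggestion of trimming the redundant translation tags before loading can be illustrated with a short sketch. This is not a bioperl-db SeqProcessor (that would be a Perl module plugged into the loader); it is a language-neutral illustration in Python in which each feature is a plain dictionary mapping tag names to lists of values. The names `strip_tags`, `unwanted`, and `geneA` are hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-in for a sequence feature: tag name -> list of values.
Feature = Dict[str, List[str]]


def strip_tags(features: List[Feature],
               unwanted: Sequence[str] = ("translation",)) -> List[Feature]:
    """Return copies of the features with the unwanted tags removed.

    The input list is left untouched, so the caller can still inspect
    the original annotation.
    """
    return [
        {tag: list(values) for tag, values in feat.items() if tag not in unwanted}
        for feat in features
    ]


# Example input: one CDS-like feature carrying a translation, one without.
features = [
    {"gene": ["geneA"], "translation": ["MVAVTGEVLHLL"]},
    {"note": ["no protein here"]},
]
trimmed = strip_tags(features)
```

The same filtering step, written as a SeqProcessor in Perl, would run on each sequence object before it is handed to the persistence layer, so the oversized values never reach the database.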
> On Nov 28, 2006, at 4:13 PM, gang wu wrote:
>
>
>> Hi everyone,
>>
>> I'm using load_seqdatabase.pl to upload some GenBank genome
>> sequences to my Oracle BioSQL database. I saw some errors (see the
>> attached warning message) related to seqfeature_qualifier_value
>> (SG_SEQFEATURE_QUALIFIER_ASSOC.VALUE column), which has a VARCHAR2 data
>> type with a maximum of 4000 bytes. Has anybody mentioned this issue
>> before? Should I just change the column to a type that can store more
>> data, such as LONG or CLOB?
>>
>> Thanks.
>>
>> Gang
>>
>>
>> Log information:
>> ============================================
>>
>> load_seqdatabase.pl -host elegans -driver Oracle -dbname sparc -dbuser
>> biosqldb-sgowner -dbpass PASS -format genbank -namespace genbank
>> /genomeseq/arabidopsis//NC_003070.gbk
>>
>>
>> Loading /genomeseq/arabidopsis//NC_003070.gbk ...
>>
>>
>> -------------------- WARNING ---------------------
>> MSG: SimpleValueAdaptor::add_assoc: unexpected failure of statement
>> execution: ORA-01461: can bind a LONG value only for insert into a
>> LONG
>> column (DBD ERROR: error possibly near <*> indicator at char 12 in
>> 'INSERT INTO <*>seqfeature_qualifier_value (fea_oid, trm_oid, value,
>> rank) VALUES (:p1, :p2, :p3, :p4)')
>> name: INSERT ASSOC [2]
>> Bio::SeqFeature::Generic;Bio::Annotation::SimpleValue
>> values: FK[Bio::SeqFeature::Generic]:14898,
>> FK[Bio::Annotation::SimpleValue]:800,
>> value:"MVAVTGEVLHLLRRYLGEYVHGLSTEALRISVWKGDVVLKDLKLKAEALNSLKLPVAVKSGFV
>> GTITLKVPWKSLGKEPVIVLIDRVFVLAYPAPDDRTLKFFTLVGTEFAYTNYIPGGRQGKASRNQASADR
>> GTSYFWLMELHGYEAETATLEARAKSKLGSPPQGNSWLGSIIATIIGNLKVSISNVHIRYEDSTRDSSEI
>> LASFFSYFNNICSSNPGHPFAAGITLAKLAAVTMDEEGNETFDTSGALDKLRKSLQLERLALYHDSNSFP
>> WEIEKQWDNITPEEWIEMFEDGIKEQTEHKIKSKWALNRHYLLSPINGSLKYHRLGNQERNNPEIPFERA
>> SVILNDVNVTITEEQYHDWIKLVEVVSRYKTYIEISHLRPMVPVSEAPRLWWRFAAQASLQQKRLWYTRY
>> IQLYANFLQQSSDVNYPEMREIEKDLDSKVILLWRLLAHAKVESVKSKEAAEQRKLKKGGWFSFNWRTEA
>> EDDPEVDSVAGGSKLMEERLTKDEWKAINKLLSHQPDEEMNLYSGKDMQNMTHFLVTVSIGQGAARIVDI
>> NQTEVLCGRFEQLDVTTKFRHRSTQCDVSLRFYGLSAPEGSLAQSVSSERKTNALMASFVNAPIGENIDW
>> RLSATISPCHATIWTESYDRVLEFVKRSNAVSPTVALETAAVLQMKLEEVTRRAQEQLQIVLEEQSRFAL
>> DIDIDAPKVRIPLRASGSSKCSSHFLLDFGNFTLTTMDTRSEEQRQNLYSRFCISGRDIAAFFTDCGSDN
>> QGCSLVMEDFTNQPILSPILEKADNVYSLIDRCGMAVIVDQIKVPHPSYPSTRISIQVPNIGVHFSPTRY
>> MRIMQLFDILYGAMKTYSQAPVDHMPDGIQPWSPTDLASDARILVWKGIGNSVATWQSCRLVLSGLYLYT
>> FESEKSLDYQRYLCMAGRQVFEVPPANIGGSPYCLAVGVRGTDLKKALESSSTWIIEFQGEEKAAWLRGL
>> VQATYQASA
>> PLSGDVLGQTSDGDGDFHEPQTRNMKAADLVITGALVETKLYLYGKIKNECDEQVEEVLLLKVLASGGKV
>> HLISSESGLTVRTKLHSLKIKDELQQQQSGSAQYLAYSVLKNEDIQESLGTCDSFDKEMPVGHADDEDAY
>> TDALPEFLSPTEPGTPDMDMIQCSMMMDSDEHVGLEDTEGGFHEKDTSQGKSLCDEVFYEVQGGEFSDFV
>> SVVFLTRSSSSHDYNGIDTQMSIRMSKLEFFCSRPTVVALIGFGFDLSTASYIENDKDANTLVPEKSDSE
>> KETNDESGRIEGLLGYGKDRVVFYLNMNVDNVTVFLNKEDGSQLAMFVQERFVLDIKVHPSSLSVEGTLG
>> NFKLCDKSLDSGNCWSWLCDIRDPGVESLIKFKFSSYSAGDDDYEGYDYSLSGKLSAVRIVFLYRFVQEV
>> TAYFMGLATPHSEEVIKLVDKVGGFEWLIQKDEMDGATAVKLDLSLDTPIIVVPRDSLSKDYIQLDLGQL
>> EVSNEISWHGCPEKDATAVRVDVLHAKILGLNMSVGINGSIGKPMIREGQGLDIFVRRSLRDVFKKVPTL
>> SVEVKIDFLHAVMSDKEYDIIVSCTSMNLFEEPKLPPDFRGSSSGPKAKMRLLADKVNLNSQMIMSRTVT
>> ILAVDINYALLELRNSVNEESSLAHVAVRASEPNSSISWMTSLSETDLYVSVPKVSVLDIRPNTKPEMRL
>> MLGSSVDASKQASSESLPFSLNKGSFKRANSRAVLDFDAPCSTMLLMDYRWRASSQSCVLRVQQPRILAV
>> PDFLLAVGEFFVPALRAITGRDETLDPTNDPITRSRGIVLSEPLYKQTEDVVHLSPRRQLVADSLGIDEY
>> TYDGCGKVISLSEQGEKDLNVGRLEPIIIVGHGKKLRFVNVKIKNGSLLSKCIYLSNDSSCLFSPEDGVD
>> ISMLENASSNPENVLSNAHKSSDVSDTCQYDSKSGQSFTFEAQVVSPEFTFFDGTKSSLDDSSAVEKLLR
>> VKLDFNFM
>> YASKEKDIWVRALLKNLVVETGSGLIILDPVDISGGYTSVKEKTNMSLTSTDIYMHLSLSALSLLLNLQS
>> QVTGALQSGNAIPLASCTNFDRIWVSPKENGPRNNLTIWRPQAPSNYVILGDCVTSRAIPPTQAVMAVSN
>> TYGRVRKPIGFNRIGLFSVIQGLEGDNVQHSHNSNECSLWMPVAPVGYTAMGCVANIGSEQPPDHIVYCL
>> SIWRADNVLGAFYAHTSTAAPSKKYSPGLSHCLLWNPLQSKTSSSSDPSSTSGSRSEQSSDQTGNSSGWD
>> ILRSISKATSYHVSTPNFERIWWDKGGDLRRPVSIWRPVPRPGFAILGDSITEGLEPPALGILFKADDSE
>> IAAKPVQFNKVAHIVGKGFDEVFCWFPVAPPGYVSLGCVLSKFDEAPHVDSFCCPRIDLVNQANIYEASV
>> TRSSSSKSSQLWSIWKVDNQACTFLARSDLKRPPSRMAFAVGESVKPKTQENVNAEIKLRCFSLTLLDGL
>> HGMMTPLFDTTVTNIKLATHGRPEAMNAVLISSIAASTFNPQLEAWEPLLEPFDGIFKLETYDTALNQSS
>> KPGKRLRIAATNILNINVSAANLETLGDAVVSWRRQLELEERAAKMKEESAASRESGDLSAFSALDEDDF
>> QTIVVENKLGRDIYLKKLEENSDVVVKLCHDENTSVWVPPPRFSNRLNVADSSREARNYMTVQILEAKGL
>> HIIDDGNSHSFFCTLRLVVDSQGAEPQKLFPQSARTKCVKPSTTIVNDLMECTSKWNELFIFEIPRKGVA
>> RLEVEVTNLAAKAGKGEVVGSLSFPVGHGESTLRKVASVRMLHQSSDAENISSYTLQRKNAEDKHDNGCL
>> LISTSYFEKTTIPNTLRNMESKDFVDGDTGFWIGVRPDDSWHSIRSLLPLCIAPKSLQNDFIAMEVSMRN
>> GRKHATFRCLATVVNDSDVNLEISISSDQNVSSGVSNHNAVIASRSSYVLPWGCLSKDNEQCLHIRPKVE
>> NSHHSYAWGYCIAVSSGCGKDQPFVDQGLLTRQNTIKQSSRASTFFLRLNQLEKKDMLFCCQPSTGSKPL
>> WLSVGADAS
>> VLHTDLNTPVYDWKISISSPLKLENRLPCPVKFTVWEKTKEGTYLERQHGVVSSRKSAHVYSADIQRPVY
>> LTLAVHGGWALEKDPIPVLDISSNDSVSSFWFVHQQSKRRLRVSIERDVGETGAAPKTIRFFVPYWITND
>> SYLPLSYRVVEIEPSENVEAGSPCLTRASKSFKKNPVFSMERRHQKKNVRVLESIEDTSPMPSMLSPQES
>> AGRSGVVLFPSQKDSYVSPRIGIAVAARDSDSYSPGISLLELEKKERIDVKAFCKDASYYMLSAVLNMTS
>> DRTKVIHLQPHTLFINRVGVSICLQQCDCQTEEWINPSDPPKLFGWQSSTRLELLKLRVKGYRWSTPFSV
>> FSEGTMRVPVPKEDGTDQLQLRVQVRSGTKNSRYEVIFRPNSISGPYRIENRSMFLPIRYRQVEGVSESW
>> QFLPPNAAASFYWENLGRRHLFELLVDGNDPSNSEKFDIDKIGDYPPRSESGPTRPIRVTILKEDKKNIV
>> RISDWMPAIEPTSSISRRLPASSLSELSGNESQQSHLLASEDSEFHVIVELAELGISVIDHAPEEILYMS
>> VQNLFVAYSTGLGSGLSRFKLRMQGIQVDNQLPLAPMPVLFRPQRTGDKADYILKFSVTLQSNAGLDLRV
>> YPYIDFQGRENTAFLINIHEPIIWRIHEMIQQANLSRLSDPNSTAVSVDPFIQIGVLNFSEVRFRVSMAM
>> SPSQRPRGVLGFWSSLMTALGNTENMPVRISERFHENISMRQSTMINNAIRNVKKDLLGQPLQLLSGVDI
>> LGNASSALGHMSQGIAALSMDKKFIQSRQRQENKGVEDFGDIIREGGGALAKGLFRGVTGILTKPLEGAK
>> SSGVEGFVSGFGKGIIGAAAQPVSGVLDLLSKTTEGANAMRMKIAAAITSDEQLLRRRLPRAVGADSLLR
>> PYNDYRAQGQVILQLAESGSFLGQVDLFKVRGKFALTDAYESHFILPKGKVLMITHRRVILLQQPSNIMG
>> QRKFIPAK
>> DACSIQWDILWNDLVTMELSDGKKDPPNSPPSRLILYLKAKPHDPKEQFRVVKCIPNSKQAFDVYSAIDQ
>> AINLYGQNALKGMVKNKVTRPYSPISESSWAEGASQQMPASVTPSSTFGTSPTTSSS",
>> rank:"1"
>> --------------------------------------------------
>>
>>
>> =============================================
>>
>>
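The warning in the log above involves a translation value that is longer than the 4000-byte VARCHAR2 limit mentioned in the original question. A minimal sketch of a pre-load check that reports such values is given here, assuming the qualifiers are available as (tag, value) pairs. It is an illustration in Python, not part of load_seqdatabase.pl; the names `oversized_values` and `ORACLE_VARCHAR2_MAX_BYTES` are hypothetical, and the 4135-character value is made-up test data.

```python
# Limit quoted in the message above for the VARCHAR2 value column.
ORACLE_VARCHAR2_MAX_BYTES = 4000


def oversized_values(qualifiers, limit=ORACLE_VARCHAR2_MAX_BYTES, encoding="utf-8"):
    """Yield (tag, byte_length) for each value longer than `limit` bytes.

    Lengths are measured in encoded bytes, not characters, because the
    column limit is expressed in bytes.
    """
    for tag, value in qualifiers:
        size = len(value.encode(encoding))
        if size > limit:
            yield tag, size


# Example input: a short gene name and an artificially long translation.
qualifiers = [("gene", "geneA"), ("translation", "M" * 4135)]
too_long = list(oversized_values(qualifiers))
```

Running a check like this before loading shows which tags would need trimming, or a wider column type, without waiting for the database to reject the insert.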