[Bioperl-l] found the error tarp in load_seqdatabase.pl
snoze pa
snoze.pa at gmail.com
Thu Jan 31 18:46:24 UTC 2008
The link I sent was related to my tutorial; I was following that website.
A typical example is the record below, which contains the
*xrefs (non-sequence databases):* line.
Thanks,
s
LOCUS P27912 792 aa linear VRL 15-JAN-2008
DEFINITION Genome polyprotein [Contains: Protein C (Core protein) (Capsid
protein); prM; Peptide pr; Small envelope protein M (Matrix
protein); Envelope protein E; Non-structural protein 1 (NS1)].
ACCESSION P27912
VERSION P27912.1 GI:130422
DBSOURCE swissprot: locus POLG_DEN1A, accession P27912;
class: standard.
created: Aug 1, 1992.
sequence updated: Aug 1, 1992.
annotation updated: Jan 15, 2008.
xrefs: D00502.1, BAA00394.1, B32401
*xrefs (non-sequence databases):* HSSP:Q88653, SMR:P27912,
GO:0005789, InterPro:IPR011999, InterPro:IPR013754,
InterPro:IPR001122, InterPro:IPR000069, InterPro:IPR001157,
InterPro:IPR002535, InterPro:IPR000336, Gene3D:G3DSA:2.60.98.10,
Gene3D:G3DSA:2.60.40.350, Pfam:PF01003, Pfam:PF02832, Pfam:PF00869,
Pfam:PF01004, Pfam:PF00948, Pfam:PF01570
KEYWORDS Capsid protein; Cleavage on pair of basic residues; Endoplasmic
reticulum; Envelope protein; Glycoprotein; Membrane; Secreted;
Transmembrane; Viral nucleoprotein; Virion.
SOURCE Dengue virus 1 Thailand/AHF 82-80/1980
ORGANISM Dengue virus 1 Thailand/AHF 82-80/1980
Viruses; ssRNA positive-strand viruses, no DNA stage; Flaviviridae;
Flavivirus; Dengue virus group.
REFERENCE 1 (residues 1 to 792)
AUTHORS Chu,M.C., O'Rourke,E.J. and Trent,D.W.
TITLE Genetic relatedness among structural protein genes of dengue 1
virus strains
JOURNAL J. Gen. Virol. 70 (PT 7), 1701-1712 (1989)
PUBMED 2738579
REMARK NUCLEOTIDE SEQUENCE [GENOMIC RNA].
COMMENT On May 27, 2005 this sequence version replaced gi:418950.
[FUNCTION] Protein C packages viral RNA to form a viral
nucleocapsid, and promotes virion budding (By similarity).
[FUNCTION] prM acts as a chaperone for envelope protein E during
intracellular virion assembly by masking and inactivating envelope
protein E fusion peptide. prM is matured in the last step of virion
assembly, presumably to avoid catastrophic activation of the viral
fusion peptide induced by the acidic pH of the trans-Golgi network.
After cleavage by host furin, the pr peptide is released in the
extracellular medium and small envelope protein M and envelope
protein E homodimers are dissociated (By similarity).
[FUNCTION] Envelope protein E binds cell surface receptor and is
involved in membrane fusion between virion and target cell.
Synthesized as an homodimer with prM which acts as a chaperone for
envelope protein E. After cleavage of prM, envelope protein E
dissociate from small envelope protein M and homodimerizes (By
similarity).
[FUNCTION] Non-structural protein 1 is slowly secreted from
mammalian cells, but not from mosquito cells. Secreted form elicits
protective immune response and plays an essential role in RNA
replication. Soluble and membrane-associated NS1 may activate human
complement and induce host vascular leakage. This effect might
explain the clinical manifestations of dengue hemorrhagic fever and
dengue shock syndrome (By similarity).
[SUBUNIT] prM and envelope protein E form heterodimers in the
endoplasmic reticulum and Golgi. Envelope protein E forms
homodimers. NS1 forms homodimers as well as homohexamers when
secreted. NS1 may interact with NS4A (By similarity).
[SUBCELLULAR LOCATION] Note=The virion is assembled in the
endoplasmic reticulum lumen, transported by vesicles to the Golgi,
then transported again to the cell membrane where it is released
outside the cell.
[SUBCELLULAR LOCATION] Protein C: Virion (By similarity).
[SUBCELLULAR LOCATION] Peptide pr: Secreted (By similarity).
[SUBCELLULAR LOCATION] Small envelope protein M: Virion membrane;
Single-pass type I membrane protein (By similarity).
[SUBCELLULAR LOCATION] Envelope protein E: Virion membrane;
Single-pass type I membrane protein (By similarity).
[SUBCELLULAR LOCATION] Non-structural protein 1: Secreted.
Endoplasmic reticulum membrane; Peripheral membrane protein;
Lumenal side (By similarity).
[DOMAIN] Transmembrane domains of the small envelope protein M and
envelope protein E contains an endoplasmic reticulum retention
signals (By similarity).
[PTM] Specific enzymatic cleavages in vivo yield mature proteins.
The nascent protein C contains a C-terminal hydrophobic domain that
act as a signal sequence for translocation of prM into the lumen of
the ER. Mature protein C is cleaved at a site upstream of this
hydrophobic domain by NS3. prM is cleaved in post-Golgi vesicles by
a host furin, releasing the mature small envelope protein M, and
peptide pr (By similarity).
[PTM] Envelope protein E and non-structural protein 1 are
N-glycosylated (By similarity).
FEATURES Location/Qualifiers
source 1..792
/organism="Dengue virus 1 Thailand/AHF 82-80/1980"
/specific_host="Aedes aegypti (Yellowfever mosquito)"
/specific_host="Homo sapiens (Human)"
/db_xref="taxon:11057"
Protein 1..>792
/product="Genome polyprotein [Contains: Protein C"
Region 1..101
/region_name="Topological domain"
/inference="non-experimental evidence, no additional
details recorded"
/note="Cytoplasmic (Potential)."
Region 1..100
/region_name="Mature chain"
/experiment="experimental evidence, no additional details
recorded"
/note="Protein C. /FTId=PRO_0000037884."
Region 5..114
/region_name="Flavi_capsid"
/note="Flavivirus capsid protein C. Flaviviruses are small
enveloped viruses with virions comprised of 3 proteins
called C, M and E. Multiple copies of the C protein form
the nucleocapsid, which contains the ssRNA molecule;
pfam01003"
/db_xref="CDD:85176"
Site 100..101
/site_type="cleavage"
/inference="non-experimental evidence, no additional
details recorded"
/note="Cleavage; by serine protease NS3 (By
similarity)."
Region 101..114
/region_name="Propeptide"
/experiment="experimental evidence, no additional details
recorded"
/note="ER anchor for the protein C, removed in mature form
by serine protease NS3. /FTId=PRO_0000037885."
Region 102..122
/region_name="Transmembrane region"
/inference="non-experimental evidence, no additional
details recorded"
/note="Potential."
Site 114..115
/site_type="cleavage"
/inference="non-experimental evidence, no additional
details recorded"
/note="Cleavage; by host signal peptidase (By
similarity)."
Region 115..280
/region_name="Mature chain"
/experiment="experimental evidence, no additional details
recorded"
/note="prM. /FTId=PRO_0000264649."
Region 115..205
/region_name="Mature chain"
/experiment="experimental evidence, no additional details
recorded"
/note="Peptide pr. /FTId=PRO_0000264650."
Region 119..204
/region_name="Flavi_propep"
/note="Flavivirus polyprotein propeptide. The flaviviruses
are small enveloped animal viruses containing a single
positive strand genomic RNA. The genome encodes one large
ORF a polyprotein which undergos proteolytic processing
into mature viral peptide chains; pfam01570"
/db_xref="CDD:65376"
Region 123..238
/region_name="Topological domain"
/inference="non-experimental evidence, no additional
details recorded"
/note="Extracellular (Potential)."
Site 183
/site_type="glycosylation"
/inference="non-experimental evidence, no additional
details recorded"
/note="N-linked (GlcNAc...) (Potential)."
Site 205..206
/site_type="cleavage"
/inference="non-experimental evidence, no additional
details recorded"
/note="Cleavage; by host furin (By similarity)."
Region 206..280
/region_name="Flavi_M"
/note="Flavivirus envelope glycoprotein M. Flaviviruses
are small enveloped viruses with virions comprised of 3
proteins called C, M and E. The envelope glycoprotein M is
made as a precursor, called prM; pfam01004"
/db_xref="CDD:85177"
Region 206..280
/region_name="Mature chain"
/experiment="experimental evidence, no additional details
recorded"
/note="Small envelope protein M. /FTId=PRO_0000037886."
Region 239..259
/region_name="Transmembrane region"
/inference="non-experimental evidence, no additional
details recorded"
/note="Potential."
Region 260..265
/region_name="Topological domain"
/inference="non-experimental evidence, no additional
details recorded"
/note="Cytoplasmic (Potential)."
Region 266..286
/region_name="Transmembrane region"
/inference="non-experimental evidence, no additional
details recorded"
/note="Potential."
Site 280..281
/site_type="cleavage"
/inference="non-experimental evidence, no additional
details recorded"
/note="Cleavage; by host signal peptidase (By
similarity)."
Region 281..775
/region_name="Mature chain"
/experiment="experimental evidence, no additional details
recorded"
/note="Envelope protein E. /FTId=PRO_0000037887."
Region 281..576
/region_name="Flavi_glycoprot"
/note="Flavivirus glycoprotein, central and dimerisation
domains; pfam00869"
/db_xref="CDD:85082"
Bond bond(283,310)
/bond_type="disulfide"
/inference="non-experimental evidence, no additional
details recorded"
/note="By similarity."
Region 287..725
/region_name="Topological domain"
/inference="non-experimental evidence, no additional
details recorded"
/note="Extracellular (Potential)."
Bond bond(340,401)
/bond_type="disulfide"
/inference="non-experimental evidence, no additional
details recorded"
/note="By similarity."
Site 347
/site_type="glycosylation"
/inference="non-experimental evidence, no additional
details recorded"
/note="N-linked (GlcNAc...) (Potential)."
Bond bond(354,385)
/bond_type="disulfide"
/inference="non-experimental evidence, no additional
details recorded"
/note="By similarity."
Bond bond(372,396)
/bond_type="disulfide"
/inference="non-experimental evidence, no additional
details recorded"
/note="By similarity."
Site 433
/site_type="glycosylation"
/inference="non-experimental evidence, no additional
details recorded"
/note="N-linked (GlcNAc...) (Potential)."
Bond bond(465,565)
/bond_type="disulfide"
/inference="non-experimental evidence, no additional
details recorded"
/note="By similarity."
Region 578..673
/region_name="Flavi_glycop_C"
/note="Flavivirus glycoprotein, immunoglobulin-like
domain; pfam02832"
/db_xref="CDD:66513"
Bond bond(582,613)
/bond_type="disulfide"
/inference="non-experimental evidence, no additional
details recorded"
/note="By similarity."
Region 726..746
/region_name="Transmembrane region"
/inference="non-experimental evidence, no additional
details recorded"
/note="Potential."
Region 747..752
/region_name="Topological domain"
/inference="non-experimental evidence, no additional
details recorded"
/note="Cytoplasmic (Potential)."
Region 753..773
/region_name="Transmembrane region"
/inference="non-experimental evidence, no additional
details recorded"
/note="Potential."
Region 774..>792
/region_name="Topological domain"
/inference="non-experimental evidence, no additional
details recorded"
/note="Extracellular (Potential)."
Site 775..776
/site_type="cleavage"
/inference="non-experimental evidence, no additional
details recorded"
/note="Cleavage; by host signal peptidase (By
similarity)."
Region 776..>792
/region_name="Mature chain"
/experiment="experimental evidence, no additional details
recorded"
/note="Non-structural protein 1. /FTId=PRO_0000037888."
ORIGIN
1 mnnqrkktgn psfnmlkrar nrvstgsqla krfskgllsg qgpmklvmaf vaflrflaip
61 ptagilkrwg sfkkngainv lrgfrkeisn mlnimnrrrr svtmilmllp talafhlttr
121 ggeptlivsk qergksllfk tsagvnmctl iamdlgelce dtmtykcprm teaepddvdc
181 wcnatdtwvt ygtcsqtgeh rrdkrsvald phvglgletr tetwmssega wkqiqkvetw
241 alrhpgftvi glflahaigt sitqkgiifi llmlvtpsma mrcvgignrd fveglsgatw
301 vdvvlehgsc vttmaknkpt ldiellktev tnpavlrklc ieakisnttt dsrcptqgea
361 tlveeqdtnf vcrrtfvdrg wgngcglfgk gslitcakfk cvtklegkiv qyenlkysvi
421 vtvhtgdqhq vgnettehgt iatitpqapt seiqltdyga ltldcsprtg ldfnrvvllt
481 mkkkswlvhk qwfldlplpw tsgastsqet wnrqdllvtf ktahakkqev vvlgsqegam
541 htaltgatei qtsgtttifa ghlkcrlkmd kltlkgvsyv mctgsfklek evaetqhgtv
601 lvqvkyegtd apckipfssq dekgvtqngr litanpivid kekpvnieae ppfgesyivv
661 gagekalkls wfkkgssigk mfeatargar rmailgdtaw dfgsiggvft svgklihqif
721 gtaygvlfsg vswtmkigig illtwlglns rstslsmtci avgmvtlylg vmvqadsgcv
781 inwkgkelkc gs
//
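
To check which records in a GenBank/GenPept flat file carry the
"xrefs (non-sequence databases):" line before trying to load them, something
like the sketch below might help. It is only a rough illustration and makes
several assumptions: it is plain Python and does not use BioPerl or
load_seqdatabase.pl, all the names in it (MARKER, split_records, locus_name,
non_sequence_xrefs, records_with_marker) are made up for this example, and it
treats any line that begins with an upper-case keyword followed by whitespace
as the end of the DBSOURCE block. It does not fix the crash; it only reports
which records contain the line and which database:identifier pairs that line
holds.

```python
import re

# The DBSOURCE sub-line that the failing records have in common.
MARKER = "xrefs (non-sequence databases):"

# Assumption: a new top-level field starts with an upper-case keyword
# followed by whitespace (e.g. "KEYWORDS   ..."), possibly after indentation
# has been lost. "GO:0005789" does not match because ':' follows the letters.
KEYWORD_LINE = re.compile(r"^[A-Z]+(\s|$)")


def split_records(text):
    """Yield each record, i.e. all lines up to and including a '//' line."""
    record = []
    for line in text.splitlines():
        record.append(line)
        if line.strip() == "//":
            yield "\n".join(record)
            record = []


def locus_name(record):
    """Return the second token of the LOCUS line, or None if there is none."""
    for line in record.splitlines():
        if line.startswith("LOCUS"):
            parts = line.split()
            return parts[1] if len(parts) > 1 else None
    return None


def non_sequence_xrefs(record):
    """Return (database, identifier) pairs listed after MARKER."""
    collected = []
    capturing = False
    for line in record.splitlines():
        # Drop '*' emphasis marks that a mail client may have added.
        cleaned = line.strip().replace("*", "")
        if not capturing:
            if MARKER in cleaned:
                capturing = True
                collected.append(cleaned.split(MARKER, 1)[1])
        elif KEYWORD_LINE.match(line):
            break  # reached the next top-level field
        else:
            collected.append(cleaned)
    pairs = []
    for item in " ".join(collected).split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the first ':' only, so "Gene3D:G3DSA:2.60.98.10"
        # keeps "G3DSA:2.60.98.10" as the identifier.
        database, sep, identifier = item.partition(":")
        if sep:
            pairs.append((database, identifier))
    return pairs


def records_with_marker(text):
    """Return the LOCUS names of all records that contain MARKER."""
    return [locus_name(r) for r in split_records(text) if MARKER in r]
```

Reading a file into a string and passing it to records_with_marker() would
give the list of LOCUS names to look at first; non_sequence_xrefs() can then
be run on a single record to see what would be lost if that line were removed
as a workaround.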
On Jan 31, 2008 7:12 AM, Hilmar Lapp <hlapp at gmx.net> wrote:
>
> On Jan 30, 2008, at 2:30 PM, snoze pa wrote:
>
> > Hi Hilmar,
> >
> > After spending lots of time i figure out the error. I am able to load
> > sequences if the sequences do not have following entry
> >
> > xrefs (non-sequence databases):
>
> Is this the literal value? I am asking because I can't find this in
> the file at
>
> http://biopython.open-bio.org/SRC/biopython/Tests/GenBank/cor6_6.gb
>
> which you said was giving you grief. So does the genbank file above
> now load, or how can I identify the critical line in there?
>
> -hilmar
> >
> > If the Genbank sequence have this entry then script load_seqdatabase.pl
> > is crashing. I try it in couple of sequences and found it is the culprit
> > line genbank format. But this line is important as it contain lots of
> > information... so I am wondering how to solve this problem
> >
> > Any help?
> >
> > Thanks in advance
> > s
>
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================