MOTIFS from: swissprot:vav_human

Mismatches: 0                                        July 29, 1999 20:55  ..

VAV_HUMAN  Check: 4177  Length: 846  ! P15498 VAV PROTO-ONCOGENE. 7/98
______________________________________________________________________________

Asn_Glycosylation   N~(P)(S,T)~(P)
    N~P(T)~P
        377: EVKRD  NETL  RQITN
    N~P(T)~P
        510: NIYPE  NATA  NGHDF
    N~P(S)~P
        641: WWEGR  NTST  NEIGW
    N~P(S)~P
        687: ESILA  NRSD  GTFLV

************************
* N-glycosylation site *
************************

It has been known for a long time [1] that potential N-glycosylation sites are
specific to the consensus sequence Asn-Xaa-Ser/Thr. It must be noted that the
presence of the consensus tripeptide is not sufficient to conclude that an
asparagine residue is glycosylated, due to the fact that the folding of the
protein plays an important role in the regulation of N-glycosylation [2]. It
has been shown [3] that the presence of proline between Asn and Ser/Thr will
inhibit N-glycosylation; this has been confirmed by a recent [4] statistical
analysis of glycosylation sites, which also shows that about 50% of the sites
that have a proline C-terminal to Ser/Thr are not glycosylated.

It must also be noted that there are a few reported cases of glycosylation
sites with the pattern Asn-Xaa-Cys; an experimentally demonstrated occurrence
of such a non-standard site is found in the plasma protein C [5].

-Consensus pattern: N-{P}-[ST]-{P}
                    [N is the glycosylation site]
-Last update: May 1991 / Text revised.

[ 1] Marshall R.D.
     Annu. Rev. Biochem. 41:673-702(1972).
[ 2] Pless D.D., Lennarz W.J.
     Proc. Natl. Acad. Sci. U.S.A. 74:134-138(1977).
[ 3] Bause E.
     Biochem. J. 209:331-336(1983).
[ 4] Gavel Y., von Heijne G.
     Protein Eng. 3:433-442(1990).
[ 5] Miletich J.P., Broze G.J. Jr.
     J. Biol. Chem. 265:11397-11404(1990).
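
As an illustration of how the consensus pattern N-{P}-[ST]-{P} can be applied,
the following is a minimal sketch using Python's standard re module. The
function name find_nglyc_sites and its first_pos argument are hypothetical
helpers, not part of MOTIFS or PROSITE, and the regular expression is a hand
translation of the pattern. A lookahead is used so that overlapping candidate
sites are all reported. Only a fragment of VAV_HUMAN, rebuilt from the hit at
position 377, is used as input.

```python
import re

# Hand translation of the PROSITE pattern N-{P}-[ST]-{P}.
# The lookahead (?=...) lets overlapping candidate sites all be reported.
NGLYC = re.compile(r"(?=(N[^P][ST][^P]))")


def find_nglyc_sites(seq, first_pos=1):
    """Return (position, motif) pairs; position is that of the Asn residue.

    first_pos is the sequence coordinate of seq[0] (a hypothetical helper
    argument, useful when only a fragment of the protein is available).
    """
    return [(first_pos + m.start(), m.group(1))
            for m in NGLYC.finditer(seq.upper())]


if __name__ == "__main__":
    # Fragment rebuilt from the hit "377: EVKRD NETL RQITN"; it starts at 372.
    print(find_nglyc_sites("EVKRDNETLRQITN", first_pos=372))
    # Proline right after Asn, or right after Ser/Thr, is excluded.
    print(find_nglyc_sites("ANPSA"), find_nglyc_sites("ANASP"))
```

On the fragment this reports the single site (377, 'NETL'), and nothing for
the two proline-containing test strings. As the text above stresses, a match
is only a potential site, not evidence of glycosylation.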
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
______________________________________________________________________________

Camp_Phospho_Site   (R,K)2x(S,T)
    (K){2}x(S)
        465: GDRDN  KKWS  HMFLL

****************************************************************
* cAMP- and cGMP-dependent protein kinase phosphorylation site *
****************************************************************

There have been a number of studies on the specificity of cAMP- and
cGMP-dependent protein kinases [1,2,3]. Both types of kinases appear to share
a preference for the phosphorylation of serine or threonine residues found
close to at least two consecutive N-terminal basic residues. It is important
to note that there are quite a number of exceptions to this rule.

-Consensus pattern: [RK](2)-x-[ST]
                    [S or T is the phosphorylation site]
-Last update: June 1988 / First entry.

[ 1] Feramisco J.R., Glass D.B., Krebs E.G.
     J. Biol. Chem. 255:4240-4245(1980).
[ 2] Glass D.B., Smith S.B.
     J. Biol. Chem. 258:14797-14803(1983).
[ 3] Glass D.B., El-Maghrabi M.R., Pilkis S.J.
     J. Biol. Chem. 261:2987-2993(1986).
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
______________________________________________________________________________

Ck2_Phospho_Site   (S,T)x2(D,E)
    (T)x{2}(E)
         81: RTFLS  TCCE  KFGLK
    (T)x{2}(E)
        131: IMPFP  TEEE  SVGDE
    (S)x{2}(D)
        135: PTEEE  SVGD  EDIYS
    (T)x{2}(E)
        152: DQIDD  TVEE  DEDLY
    (T)x{2}(D)
        190: MPPKM  TEYD  KRCCC
    (S)x{2}(E)
        285: YGRYC  SQVE  SASKH
    (T)x{2}(D)
        321: NNGRF  TLRD  LLMVP
    (T)x{2}(E)
        412: GELKI  TSVE  RRSKM
    (S)x{2}(D)
        418: SVERR  SKMD  RYAFL
    (S)x{2}(D)
        458: QVRDD  SSGD  RDNKK
    (S)x{2}(E)
        522: DFQMF  SFEE  TTSCK
    (T)x{2}(E)
        628: DIVEL  TKAE  AEQNW
    (S)x{2}(E)
        643: EGRNT  STNE  IGWFP
    (S)x{2}(D)
        750: FYQQN  SLKD  CFKSL
    (S)x{2}(E)
        803: DRSEL  SLKE  GDIIK

*****************************************
* Casein kinase II phosphorylation site *
*****************************************

Casein kinase II (CK-2) is a protein serine/threonine kinase whose activity is
independent of cyclic nucleotides and calcium. CK-2 phosphorylates many
different proteins. The substrate specificity [1] of this enzyme can be
summarized as follows:

(1) Under comparable conditions Ser is favored over Thr.
(2) An acidic residue (either Asp or Glu) must be present three residues from
    the C-terminal of the phosphate acceptor site.
(3) Additional acidic residues in positions +1, +2, +4, and +5 increase the
    phosphorylation rate. Most physiological substrates have at least one
    acidic residue in these positions.
(4) Asp is preferred to Glu as the provider of acidic determinants.
(5) A basic residue at the N-terminal of the acceptor site decreases the
    phosphorylation rate, while an acidic one will increase it.

-Consensus pattern: [ST]-x(2)-[DE]
                    [S or T is the phosphorylation site]
-Note: this pattern is found in most of the known physiological substrates.
-Last update: May 1991 / Text revised.

[ 1] Pinna L.A.
     Biochim. Biophys. Acta 1054:267-284(1990).
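
The consensus patterns in this report all use the same PROSITE notation, so a
small converter to regular expressions may help when checking hits by hand.
The sketch below is an assumption-laden illustration, not the code used by
MOTIFS: prosite_to_regex and scan are hypothetical names, and only the subset
of the notation that appears in this report is handled (single residues, x,
[..], {..}, and the repeat forms (n) and (n,m)). Anchors and other PROSITE
features are not supported.

```python
import re

# One pattern element: a residue, x, [allowed] or {forbidden},
# optionally followed by a repeat count (n) or range (n,m).
ELEMENT = re.compile(
    r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?"
)


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern (subset only) into a regex string."""
    parts = []
    for element in pattern.split("-"):
        m = ELEMENT.fullmatch(element.strip())
        if m is None:
            raise ValueError("unsupported pattern element: %r" % element)
        residue, low, high = m.groups()
        if residue == "x":
            core = "."                          # any residue
        elif residue.startswith("{"):
            core = "[^" + residue[1:-1] + "]"   # forbidden residues
        else:
            core = residue                      # literal or [allowed] set
        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{%s}" % low
        else:
            repeat = "{%s,%s}" % (low, high)
        parts.append(core + repeat)
    return "".join(parts)


def scan(pattern, seq, first_pos=1):
    """Return (position, match) pairs, overlapping matches included."""
    rx = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(first_pos + m.start(), m.group(1)) for m in rx.finditer(seq)]


if __name__ == "__main__":
    print(prosite_to_regex("[ST]-x(2)-[DE]"))
    # Fragment rebuilt from the hits at 131 and 135; it starts at residue 126.
    print(scan("[ST]-x(2)-[DE]", "IMPFPTEEESVGDEDIYS", first_pos=126))
```

On the fragment, the scan reports (131, 'TEEE') and (135, 'SVGD'), which
agrees with the two corresponding hits listed above.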
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
______________________________________________________________________________

Dag_Pe_Binding_Domain
    Hx(L,I,V,M,F,Y,W)x{8,11}Cx2Cx3(L,I,V,M,F,C)x{5,10}Cx2Cx4(H,D)x2Cx{5,9}C
    Hx(F)x{10}Cx{2}Cx{3}(L)x{9}Cx{2}Cx{4}(H)x{2}Cx{6}C
        516: ATANG  HDFQMFSFEETTSCKACQMLLRGTFYQGYRCHRCRASAHKECLGRVPPC  GRHGQ

**************************************************
* Phorbol esters / diacylglycerol binding domain *
**************************************************

Diacylglycerol (DAG) is an important second messenger. Phorbol esters (PE) are
analogues of DAG and potent tumor promoters that cause a variety of
physiological changes when administered to both cells and tissues. DAG
activates a family of serine/threonine protein kinases, collectively known as
protein kinase C (PKC) [1]. Phorbol esters can directly stimulate PKC.

The N-terminal region of PKC, known as C1, has been shown [2] to bind PE and
DAG in a phospholipid and zinc-dependent fashion. The C1 region contains one
or two copies (depending on the isozyme of PKC) of a cysteine-rich domain
about 50 amino-acid residues long and essential for DAG/PE-binding.

Such a domain has also been found in the following proteins:

 - Diacylglycerol kinase (EC 2.7.1.107) (DGK) [3], the enzyme that converts
   DAG into phosphatidate. It contains two copies of the DAG/PE-binding domain
   in its N-terminal section. At least five different forms of DGK are known
   in mammals.
 - N-chimaerin. A brain specific protein which shows sequence similarities
   with the BCR protein at its C-terminal part and contains a single copy of
   the DAG/PE-binding domain at its N-terminal part. It has been shown [4,5]
   to be able to bind phorbol esters.
 - The raf/mil family of serine/threonine protein kinases. These protein
   kinases contain a single N-terminal copy of the DAG/PE-binding domain.
 - The unc-13 protein from Caenorhabditis elegans.
   Its function is not known but it contains a copy of the DAG/PE-binding
   domain in its central section and has been shown to bind specifically to a
   phorbol ester in the presence of calcium [6].
 - The vav oncogene. Vav was generated by a genetic rearrangement during gene
   transfer assays. Its expression seems to be restricted to cells of
   hematopoietic origin. Vav seems [5,7] to contain a DAG/PE-binding domain in
   the central part of the protein.
 - The Drosophila GTPase activating protein rotund.

The DAG/PE-binding domain binds two zinc ions; the ligands of these metal ions
are probably the six cysteines and two histidines that are conserved in this
domain. We have developed a signature pattern that completely spans the DAG/PE
domain.

-Consensus pattern: H-x-[LIVMFYW]-x(8,11)-C-x(2)-C-x(3)-[LIVMFC]-x(5,10)-
                    C-x(2)-C-x(4)-[HD]-x(2)-C-x(5,9)-C
                    [All the C and H are probably involved in binding zinc]
-Sequences known to belong to this class detected by the pattern: ALL, except
 a few DGK's.
-Other sequence(s) detected in SWISS-PROT: NONE.
-Last update: November 1997 / Pattern and text revised.

[ 1] Azzi A., Boscoboinik D., Hensey C.
     Eur. J. Biochem. 208:547-557(1992).
[ 2] Ono Y., Fujii T., Igarashi K., Kuno T., Tanaka C., Kikkawa U.,
     Nishizuka Y.
     Proc. Natl. Acad. Sci. U.S.A. 86:4868-4871(1989).
[ 3] Sakane F., Yamada K., Kanoh H., Yokoyama C., Tanabe T.
     Nature 344:345-348(1990).
[ 4] Ahmed S., Kozma R., Monfries C., Hall C., Lim H.H., Smith P., Lim L.
     Biochem. J. 272:767-773(1990).
[ 5] Ahmed S., Kozma R., Lee J., Monfries C., Harden N., Lim L.
     Biochem. J. 280:233-241(1991).
[ 6] Ahmed S., Maruyama I.N., Kozma R., Lee J., Brenner S., Lim L.
     Biochem. J. 287:995-999(1992).
[ 7] Boguski M.S., Bairoch A., Attwood T.K., Michaels G.S.
     Nature 358:113-113(1992).
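
Since the pattern annotation states that all the C and H positions are
probably involved in binding zinc, it can be useful to list the coordinates of
those eight positions in a hit. The following is a minimal sketch under stated
assumptions: the regular expression is a hand translation of the consensus
pattern with one capture group per putative ligand, zinc_ligands and first_pos
are hypothetical names, and the input is the VAV_HUMAN hit at 516 together
with its five flanking residues on each side (so the fragment starts at 511).
The output identifies pattern positions only; it says nothing about which
residues actually coordinate zinc.

```python
import re

# Hand translation of the DAG/PE consensus pattern. Each pair of parentheses
# captures one putative zinc ligand: the leading H, six C, and the [HD] site.
C1_DOMAIN = re.compile(
    r"(H).[LIVMFYW].{8,11}(C).{2}(C).{3}[LIVMFC].{5,10}"
    r"(C).{2}(C).{4}([HD]).{2}(C).{5,9}(C)"
)


def zinc_ligands(seq, first_pos=1):
    """Return (residue, position) for the eight captured ligand positions.

    first_pos is the sequence coordinate of seq[0] (hypothetical argument).
    An empty list is returned when the pattern does not match.
    """
    m = C1_DOMAIN.search(seq)
    if m is None:
        return []
    return [(m.group(i), first_pos + m.start(i)) for i in range(1, 9)]


if __name__ == "__main__":
    fragment = (
        "ATANG"                                              # left flank
        "HDFQMFSFEETTSCKACQMLLRGTFYQGYRCHRCRASAHKECLGRVPPC"  # hit at 516
        "GRHGQ"                                              # right flank
    )
    for residue, position in zinc_ligands(fragment, first_pos=511):
        print(residue, position)
```

For this fragment the captured positions are H516, C529, C532, C546, C549,
H554, C557 and C564.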
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
______________________________________________________________________________

Gds_Cdc24
    Lx2(L,I,V,M,F,Y,W)Lx2P(L,I,V,M)x2(L,I,V,M)x(K,R,S)x2Lx(L,I,V,M)x(D,E,Q)(L,I,V,M)x3(S,T)
    Lx{2}(L)Lx{2}P(M)x{2}(V)x(K)x{2}Lx(L)x(E)(L)x{3}(T)
        322: NGRFT  LRDLLMVPMQRVLKYHLLLQELVKHT  QEAME

**********************************************************************
* Guanine-nucleotide dissociation stimulators CDC24 family signature *
**********************************************************************

Ras proteins are membrane-associated molecular switches that bind GTP and GDP
and slowly hydrolyze GTP to GDP [1]. The balance between the GTP bound
(active) and GDP bound (inactive) states is regulated by the opposite action
of proteins activating the GTPase activity and that of proteins which promote
the loss of bound GDP and the uptake of fresh GTP [2,3]. The latter proteins
are known as guanine-nucleotide dissociation stimulators (GDSs) (or also as
guanine-nucleotide releasing (or exchange) factors (GRFs)). Proteins that act
as GDS can be classified into at least two families, on the basis of sequence
similarities. One of these families is currently known to group the proteins
listed below (references are only provided for recently determined sequences):

 - CDC24 from yeast. CDC24 is a GDS that acts on the ras-like protein CDC42.
 - Dbl (or mcf-2) oncogene from mammals. Dbl is a GDS for a ras-like protein
   known as G25K or CDC42Hs.
 - p140-RAS GRF (cdc25Mm) from mammals. This protein, a GDS for ras, possesses
   both a domain belonging to the CDC24 family and one belonging to the CDC25
   family.
 - Bcr oncogene from mammals. Bcr can form a chimera with the abl protein and
   then cause chronic myelogenous leukemia (CML). Bcr acts on p21-rac
   proteins.
 - Oncogene vav from mammals. The target of this protein is not yet known.
 - Oncogene ect2 from mouse [4]. The target of this protein is not yet known.
 - scd1 from fission yeast.

The sizes of these proteins range from 736 residues (CDC42) to 1271 residues
(bcr). The sequence similarity shared by all these proteins is limited to a
region of about 180 amino acids, generally located in their N-terminal or
central section. As a signature pattern, we selected the most conserved part
of this domain.

-Consensus pattern: L-x(2)-[LIVMFYW]-L-x(2)-P-[LIVM]-x(2)-[LIVM]-x-[KRS]-x(2)-
                    L-x-[LIVM]-x-[DEQ]-[LIVM]-x(3)-[ST]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in SWISS-PROT: NONE.
-Last update: November 1995 / Pattern and text revised.

[ 1] Bourne H.R., Sanders D.A., McCormick F.
     Nature 349:117-127(1991).
[ 2] Boguski M.S., McCormick F.
     Nature 366:643-654(1993).
[ 3] Downward J.
     Curr. Biol. 2:329-331(1992).
[ 4] Miki T., Smith C.L., Long J.E., Eva A., Fleming T.P.
     Nature 362:462-465(1993).
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
______________________________________________________________________________

Myristyl   G~(E,D,R,K,H,P,F,Y,W)x2(S,T,A,G,C,N)~(P)
    G~(E,D,R,K,H,P,F,Y,W)x{2}(C)~P
         27: RVTWD  GAQVCE  LAQAL
    G~(E,D,R,K,H,P,F,Y,W)x{2}(C)~P
         40: QALRD  GVLLCQ  LLNNL
    G~(E,D,R,K,H,P,F,Y,W)x{2}(S)~P
         87: CCEKF  GLKRSE  LFEAF
    G~(E,D,R,K,H,P,F,Y,W)x{2}(A)~P
        786: STKYF  GTAKAR  YDFCA

*************************
* N-myristoylation site *
*************************

An appreciable number of eukaryotic proteins are acylated by the covalent
addition of myristate (a C14-saturated fatty acid) to their N-terminal residue
via an amide linkage [1,2]. The sequence specificity of the enzyme responsible
for this modification, myristoyl CoA:protein N-myristoyl transferase (NMT),
has been derived from the sequence of known N-myristoylated proteins and from
studies using synthetic peptides. It seems to be the following:

 - The N-terminal residue must be glycine.
 - In position 2, uncharged residues are allowed.
   Charged residues, proline and large hydrophobic residues are not allowed.
 - In positions 3 and 4, most, if not all, residues are allowed.
 - In position 5, small uncharged residues are allowed (Ala, Ser, Thr, Cys,
   Asn and Gly). Serine is favored.
 - In position 6, proline is not allowed.

-Consensus pattern: G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}
                    [G is the N-myristoylation site]
-Note: we deliberately include as potential myristoylated glycine residues,
 those which are internal to a sequence. It could well be that the sequence
 under study represents a viral polyprotein precursor and that subsequent
 proteolytic processing could expose an internal glycine as the N-terminal of
 a mature protein.
-Last update: October 1989 / Pattern and text revised.

[ 1] Towler D.A., Gordon J.I., Adams S.P., Glaser L.
     Annu. Rev. Biochem. 57:69-99(1988).
[ 2] Grand R.J.A.
     Biochem. J. 258:625-638(1989).
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
______________________________________________________________________________

Pkc_Phospho_Site   (S,T)x(R,K)
    (S)x(R)
         20: RVLPP  SHR  VTWDG
    (S)x(R)
        312: KLEEC  SQR  ANNGR
    (T)x(R)
        321: NNGRF  TLR  DLLMV
    (T)x(R)
        379: KRDNE  TLR  QITNF
    (S)x(K)
        528: FEETT  SCK  ACQML
    (T)x(K)
        574: QDFPG  TMK  KDKLH
    (S)x(K)
        708: AEFAI  SIK  YNVEV
    (T)x(K)
        718: VEVKH  TVK  IMTAE
    (T)x(K)
        731: GLYRI  TEK  KAFRG
    (S)x(K)
        750: FYQQN  SLK  DCFKS
    (S)x(K)
        781: RPAVG  STK  YFGTA
    (T)x(K)
        787: TKYFG  TAK  ARYDF
    (S)x(K)
        803: DRSEL  SLK  EGDII

*****************************************
* Protein kinase C phosphorylation site *
*****************************************

In vivo, protein kinase C exhibits a preference for the phosphorylation of
serine or threonine residues found close to a C-terminal basic residue [1,2].
The presence of additional basic residues at the N- or C-terminal of the
target amino acid enhances the Vmax and Km of the phosphorylation reaction.

-Consensus pattern: [ST]-x-[RK]
                    [S or T is the phosphorylation site]
-Last update: June 1988 / First entry.
[ 1] Woodgett J.R., Gould K.L., Hunter T.
     Eur. J. Biochem. 161:177-184(1986).
[ 2] Kishimoto A., Nishiyama K., Nakanishi H., Uratsuji Y., Nomura H.,
     Takeyama Y., Nishizuka Y.
     J. Biol. Chem. 260:12492-12499(1985).
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
______________________________________________________________________________

Rgd   RGD
        437: LICKR  RGD  SYDLK

****************************
* Cell attachment sequence *
****************************

The sequence Arg-Gly-Asp, found in fibronectin, is crucial for its interaction
with its cell surface receptor, an integrin [1,2]. What has been called the
'RGD' tripeptide is also found in the sequences of a number of other proteins,
where it has been shown to play a role in cell adhesion. These proteins are:
some forms of collagens, fibrinogen, vitronectin, von Willebrand factor (VWF),
snake disintegrins, and slime mold discoidins. The 'RGD' tripeptide is also
found in other proteins where it may also, but not always, serve the same
purpose.

-Consensus pattern: R-G-D
-Last update: December 1991 / Text revised.

[ 1] Ruoslahti E., Pierschbacher M.D.
     Cell 44:517-518(1986).
[ 2] D'Souza S.E., Ginsberg M.H., Plow E.F.
     Trends Biochem. Sci. 16:246-250(1991).
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
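
Every hit in this report has the same shape: a position, the left flanking
residues, the matched segment, and the right flanking residues. A minimal
sketch for pulling those fields out of one hit line is given below. The name
parse_hit and the returned dictionary keys are hypothetical, and the sketch
assumes that the number is the coordinate of the first matched residue, which
is consistent with the overlapping hits in this report but is not taken from
the MOTIFS documentation. Hits without both flanks, for example at the very
start of a sequence, are not handled.

```python
import re

# position, left flank, matched segment, right flank, separated by blanks
HIT = re.compile(r"^\s*(\d+):\s+([A-Za-z]+)\s+([A-Za-z]+)\s+([A-Za-z]+)\s*$")


def parse_hit(line):
    """Parse one hit line such as '437: LICKR RGD SYDLK'.

    Returns a dict, or None when the line does not look like a hit.
    """
    m = HIT.match(line)
    if m is None:
        return None
    position, left, matched, right = m.groups()
    start = int(position)
    return {
        "start": start,                    # assumed: first matched residue
        "end": start + len(matched) - 1,   # last matched residue
        "left": left,
        "match": matched,
        "right": right,
    }


if __name__ == "__main__":
    print(parse_hit("437: LICKR RGD SYDLK"))
    print(parse_hit("Rgd RGD"))  # a header line, not a hit
```

For the Rgd hit this gives start 437 and end 439 with the match 'RGD'; the
header line yields None.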