Intein Motifs

Comparative analysis of intein sequences reveals conserved motifs that can be used to identify and characterize inteins. Four conserved motifs, blocks A, B, F, and G, are found in all known inteins. Blocks A and B lie at the N terminus of the intein, and blocks F and G lie at its C terminus. Pietrokovski characterized two additional motifs (N2 and N4) located close to the N terminus. Inteins with an endonuclease domain have four further conserved motifs (blocks C, D, E, and H). Site-directed mutagenesis and comparative sequence and structure analyses indicate that the N- and C-terminal motifs (blocks A, N2, B, N4, F, and G) are involved in protein splicing, whereas the endonuclease activity involves the central blocks C, D, E, and H. The extein amino acid immediately following the intein insertion site and block A at the N terminus of the intein contain residues chemically essential for splicing. Blocks C and E are the dodecapeptide (DOD) motifs required for endonuclease activity.

Comparison of names used for conserved intein motifs. The first column gives the abbreviations used in the older literature; the second column gives the names suggested in Pietrokovski S. (1998).

Perler et al.   Pietrokovski   Other names
A               N1             N-terminal splicing motif
-               N2             -
B               N3             -
-               N4             -
C               EN1            DOD, LAGLIDADG motif
D               EN2            -
E               EN3            DOD, LAGLIDADG motif
H               EN4            -
-               HNH            HNH endonuclease motif
F               C2             -
G               C1             C-terminal splicing motif
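
For readers who move between the two nomenclatures, the table can be encoded as a small lookup. The following is a minimal Python sketch, not part of the original source: the names MOTIF_NAMES, to_pietrokovski, and to_perler are hypothetical, a "-" entry is represented as None, and the row for block D is read as D / EN2 / -.

```python
# Rows of the nomenclature table: (Perler et al., Pietrokovski, other names).
# None stands for a "-" entry in the table.
MOTIF_NAMES = [
    ("A", "N1", "N-terminal splicing motif"),
    (None, "N2", None),
    ("B", "N3", None),
    (None, "N4", None),
    ("C", "EN1", "DOD, LAGLIDADG motif"),
    ("D", "EN2", None),
    ("E", "EN3", "DOD, LAGLIDADG motif"),
    ("H", "EN4", None),
    (None, "HNH", "HNH endonuclease motif"),
    ("F", "C2", None),
    ("G", "C1", "C-terminal splicing motif"),
]


def to_pietrokovski(block):
    """Return the Pietrokovski name for a Perler et al. block letter."""
    for perler, pietrokovski, _other in MOTIF_NAMES:
        if perler == block:
            return pietrokovski
    raise KeyError(f"no Perler et al. block named {block!r}")


def to_perler(name):
    """Return the Perler et al. block letter for a Pietrokovski name.

    Returns None when the table lists no equivalent ("-").
    """
    for perler, pietrokovski, _other in MOTIF_NAMES:
        if pietrokovski == name:
            return perler
    raise KeyError(f"no Pietrokovski motif named {name!r}")
```

For example, to_pietrokovski("G") gives "C1", while to_perler("N2") gives None because N2 has no counterpart in the older scheme.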

Motif A starts at the amino acid (aa) preceding the amino (N') end of the intein, and motif G ends at the aa following the carboxyl (C') end of the intein. The C' splice junction region is composed of two motifs (F and G), which are either consecutive or separated by one or two aa. Motif D precedes the second DOD motif (E) by six or nine aa. The distance between the two DOD motifs (C and E) is also conserved.
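
The spacing rules above can be turned into a simple consistency check on annotated motif positions. The sketch below is illustrative only and rests on several assumptions: motif positions are given as 1-based inclusive (start, end) coordinates, "six or nine aa" is read as the number of residues between the end of motif D and the start of motif E, and the function names gap and check_spacing are hypothetical. The C to E distance is not checked because the text gives no value for it.

```python
def gap(first, second):
    """Number of residues between two motifs given as (start, end), 1-based inclusive."""
    return second[0] - first[1] - 1


def check_spacing(motifs):
    """Check motif spacing against the rules stated in the text.

    motifs maps a block letter to its (start, end) coordinates.
    Returns a dict of rule name -> bool for every rule whose motifs are present.
    """
    results = {}
    if "F" in motifs and "G" in motifs:
        # F and G are consecutive or separated by one or two aa.
        results["F-G"] = gap(motifs["F"], motifs["G"]) in (0, 1, 2)
    if "D" in motifs and "E" in motifs:
        # Assumed reading: six or nine aa lie between motif D and motif E.
        results["D-E"] = gap(motifs["D"], motifs["E"]) in (6, 9)
    return results
```

With made-up coordinates such as {"D": (200, 210), "E": (217, 228), "F": (430, 445), "G": (447, 454)}, both rules pass: six residues separate D from E and one residue separates F from G.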