|
InBase Reference: Perler, F. B. (2002). InBase, the Intein Database. Nucleic Acids Res. 30, 383-384.
The Endonuclease Motifs Page will not be updated after July 7, 2003
Please note: Finding DOD homing endonuclease motifs is not sufficient evidence that a gene contains an intein, since DOD homing endonucleases are also present in introns and as free-standing genes.
DOD (LAGLIDADG) Homing Endonuclease Motifs
The entire intein sequence is present in the individual intein files.
Central intein motifs C, D, E & H are in the DOD family core homing endonuclease domain (Duan 1997, Hall 1997, Klabunde 1998, Perler 1998, Perler 1997, Pietrokovski 1998, Dalgaard 1997, Belfort 1997, Jurica 1999 and Mueller 1994). These motifs form four conserved helices (Duan 1997 and Heath 1997). Blocks C and E are the original LAGLIDADG motifs and each contains an endonuclease active site Asp (D) or Glu (E). Block D contains a putative active site Lys (K), as observed in the Sce VMA intein (Duan 1997). Several inteins have mutations in these active site residues and therefore may not be active endonucleases, although the remainder of the motif is present.
Mini-inteins are indicated as 'none' in Blocks C, D, E & H.
Dashes indicate that the individual motif has not been found.
The Ssp GyrB intein has an HNH family homing endonuclease between intein Blocks B and F.
The Ssp DnaX and Cau RIR1 inteins have a DOD homing endonuclease present in a different reading frame than the intein.
The position of the last amino acid in each block is listed to the right of the block.
An individual amino acid or amino acid group designation (see the Consensus Line Key below) in the consensus line indicates that the amino acid (upper case letter) or group (lower case letter) is present in a majority of the first 38 inteins sequenced, excluding the highly similar allelic inteins (Perler 1997).
Dots in the consensus motifs indicate the position of non-conserved amino acids.
Please note: the absence of motif annotation indicates that the record was submitted without this information and it has yet to be added by the curator.
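The row conventions described in the notes above ('None' for mini-inteins, dashes for a motif that has not been found, and the end position to the right of each block) can be read programmatically. The following is a minimal sketch, not part of InBase: the function name `parse_row` and the returned dictionary layout are hypothetical, and the sketch assumes that a motif can be assigned to its block by length (9, 8, 10 and 19 residues for Blocks C, D, E and H, as in the consensus line). Free-text rows such as the Ssp DnaX entry are not handled.

```python
import re

# Block lengths read off the consensus line: C=9, D=8, E=10, H=19 residues.
# Assigning a motif to a block by its length is an assumption of this sketch.
BLOCK_BY_LENGTH = {9: "C", 8: "D", 10: "E", 19: "H"}
MOTIF = re.compile(r"[A-Z]+|-+")      # residues, or dashes for "not found"
POSITION = re.compile(r"[NC]?\d+")    # end position, e.g. 219, N129, C39


def parse_row(line):
    """Split one flattened table row into intein name and motif blocks."""
    tokens = line.split()
    # Mini-inteins carry 'None' in all four block columns.
    if len(tokens) > 4 and tokens[-4:] == ["None"] * 4:
        return {"name": " ".join(tokens[:-4]), "mini_intein": True, "blocks": {}}

    name, blocks = [], {}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        label = BLOCK_BY_LENGTH.get(len(tok)) if MOTIF.fullmatch(tok) else None
        if label is None:
            if not blocks:
                name.append(tok)      # still reading the intein name
            i += 1
            continue
        found = not tok.startswith("-")
        end = None
        # The end position, when present, follows the motif directly.
        if found and i + 1 < len(tokens) and POSITION.fullmatch(tokens[i + 1]):
            end = tokens[i + 1]
            i += 1
        blocks[label] = {"motif": tok if found else None, "end": end}
        i += 1
    return {"name": " ".join(name), "mini_intein": False, "blocks": blocks}


row = parse_row(
    "Sce VMA LLGLWIGDG 219 VKNIPSFL 307 FLAGLIDSDG 327 TIHTSVRDGLVSLARSLGL 359"
)
print(row["name"], row["blocks"]["E"])
```

A block that is missing from the returned `blocks` dictionary corresponds to an empty cell in the table, which is distinct from a block whose `motif` is `None` (dashes in the table).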
Intein Block C Block D Block E Block H
Consensus LhG..hhaG .K.IP..h .L.GhFahDG p.S..hh..h..LL..hGI |
| Eucarya |
Intein Block C Block D Block E Block H
APMV Pol
Abr PRP8 None None None None
Aca PRP8 FVGFWLCDG 140 NKRIPLMY 394 LLAGLIDSDG 414 RISPTLFWDIVTLARSLGL 448
Afu-Af293 PRP8 YFLGLWLGD 357 RKHIPSIY 678 VLAGLIDSDG 698 RWHSKLFWDVVALARSLGL 732
Afu-FRR0163 PRP8 YFLGLWLGD 357 RKHIPSIY 678 VLAGLIDSDG 698 RWHSKLFWDVVALARSLGL 732
Ani-FGSCA4 PRP8 YFLGLWLGD 155 KKRIPQVF 464 VLAGLLDSDG 484 LCHKELFWDVVTLARSLGF 518
Avi PRP8 None None None None
Bci PRP8 FLGLWLGDG 391 KKHIPSIY 697 VLAGLIDSDG 717 IWHKTLFWDVVALARSLGL 751
Bde-JEL197 RPB2 LFGLVLSAG 166 YCLLPKWL 251 FLGALFGGNG 271 ESTVVYLKQVATLLGMLNI 318
Bde-JEL423 PRP8-1 SFGYLLGAW 233
Bde-JEL423 PRP8-2 FVGLWLGEY 186 IPRIPSAY 341 LLAGYASCFV 361 TDLAPLFGDLETVSRGCGL 395
Bde-JEL423 RPC2 LFGLVLSAG 166 YCLLPKWL 251 FLGALFGGNG 271 ESTVVYLKQVATLLGMLNI 318
Bde-JEL423 eIF-5B CIGLWILLG 195 CKNNMIFENIKRLSSSLGL 326
CIV RIR1 THGFFCGDG 131 -------- WLAGYLDADG 204 CIHLDFLKRIQLLLIGMGV 237
CV-NY2A ORF212392 ILGAWLGDG 171 NKHIPDDI 253 ILAGLLDTDG 273 QKNKRLSEDIAFVARSLGY 302
CV-NY2A RIR1 TKGFFSADG 124 WLAGFMDGDG 200 SINKSFLQDIRLMLQTIGI 233
CZIV RIR1 None None None None
Cba-WM02.98 PRP8 None None None None
Cba-WM728 PRP8 None None None None
Ceu ClpP FFGLWIANG 151 NKYLPDWV 230 ---------- STSERFANDVSRLALHAGT 281
Cga PRP8 None None None None
Cgl VMA LLGTWIASK 187 VTNHSSKL 273 LALSMINQDG 286 VEIDSSINGLAKLSRSLGL 339
Cla PRP8 FLGLWLGDD 150 SKHIPQVY 382 VLAGLLDSHG 402 ETDGRLFWDAVHLARSLGF 435
Cne-A PRP8 (Fne-A PRP8) None None None None
Cne-AD PRP8 (Fne-AD PRP8) None None None None
Cne-JEC21 PRP8 None None None None
Cpa ThrRS None None None None
Cre RPB2 LYGYWLGDG 167 AKWFAGWV 290 ILVGLSFADG 310 TSSARFRDDTVRLALHAGY 339
Cst RPB2 LFGLFLSES 154 RKNLPDFI 229 LPNSSFADDFQRLALHAGY 281
Ctr ThrRS
Ctr VMA LLGTWAGIG 210 VKSIPQHI 325 LIAGLVDAAG 345 TSFRHVARGLVKIAHSLGI 379
Ddi RPC2
Dhan GLT1 LGLWLGDGD 363 NQNKPEGC 395 VIAGLMDSDG 484 EDHKKIVYDLKELALSCGI 518
Dhan VMA
Eni PRP8
Eni-FGSCA4 PRP8 YFLGLWLGD 155 KKRIPQVF 464 VLAGLLDSDG 484 LCHKELFWDVVTLARSLGF 518
Gth DnaB None None None None
HaV01 Pol
Hca PRP8 FVGFWLCDG 140 NKRIPLMY 393 LLAGLIDSDG 413 RISPTLFWDIVTLARSLGL 447
IIV6 RIR1 THGFFCGDG 131 -------- WLAGYLDADG 204 CIHLDFLKRIQLLLIGMGV 237
Kex-CBS379 VMA LIGFWVGNG 208 VSKIAHEL 344 FVAGLVDATG 364 IESKSTVLGLVKIARSLGI 399
Kla-CBS683 VMA LLGAFVGSS 206 SSGIPSFM 282 FLAGIVDSQN 302 TLSVKTHDGIARLARSLGI 330
Kla-IFO1267 VMA LLGAFVGSS 206 SSGIPSFM 282 FLAGIVDSQN 302 TLSVKTHDGIARLARSLGI 331
Kla-NRRLY1140 VMA LLGAFVGSS 206 SSGIPSFM 282 FLAGIVDSQN 302 TLSVKTHDGIARLARSLGI 331
Lel VMA
Nau PRP8 None None None None
Nfi PRP8 FLGLWLGDG 156 RKHIPSIY 378 VLAGLIDSDG 396 RCHSKLFWDVVALARSLGL 430
Ngl-FR2163 PRP8 None None None None
Ngl-FRR1833 PRP8 None None None None
Nqu PRP8 None None None None
Nspi PRP8 None None None None
Pan CHS2 WFGFWLGNG 405 VLAGLIESDG 529 YELKKLVEDARKLALSCGI 563
Pan GLT1 LLGFWLGDG 435 NASRPAGA 468 VIAGLIDSDG 557 DEHRKIVEDLRDLALSCGI 590
Pbl PRP8-a None None None None
Pbl PRP8-b None None None None
Pbr PRP8 YFLGLWLGD 159 KKRIPSVY 432 LLAGLIDSGG 452 DSNSTLIWDVVTLARSLGF 486
Pch PRP8 None None None None
Pex PRP8 None None None None
Pgu GLT1 FLGLWLGEG 311 NKDKPAEA 344 LLAGLVDSDG 431 ELHKKIVYDLKEMATSCGI 465
Pgu-alt GLT1 FLGLWLGEG 311 NKDKPAEA 344 LLAGLVDSDG 431 ELHKKIVYDLKEMATSCGI 465
Pno GLT1 FLGIWMGDG 386 NRQCPDGA 419 FIAGLVESDG 507 DDHKKIIYDLKDLALSCGI 523
Pno RPA2
Ppu DnaB None None None None
Pst VMA LVGLWVGDG 211 FKLVPSFL 301 FISGLVDSDG 321 TIYPAVRDGLVSVARSLGI 356
Ptr PRP8 FLGLWLGDG 370 RKHIPTAY 703 VLAGLLDSDG 723 LWHKTLFWNTVALARSLGF 757
Pvu PRP8 None None None None
Pye DnaB None None None None
Sas RPB2 IGGCLLGRG 140 ARQFPHWV 209 VASKHLADDIQRLCLHAAS 265
Sca-CBS4309 VMA LLGFWVGNG 209 EKSVPLHL 372 FIAGLIDSDG 392 TIYKGVSEGLIRLARSLGI 415
Sca-IFO1992 VMA LLGFWVGNG 209 EKSVPLHL 372 FIAGLIDSDG 392 TIYKGVSEGLIRLARSLGI 415
Scar VMA FLGLWIGDG 219 VKNIPSFL 307 FLAGLIDSDG 327 TIHTSVRDGLVSVARSLGL 359
Sce VMA LLGLWIGDG 219 VKNIPSFL 307 FLAGLIDSDG 327 TIHTSVRDGLVSLARSLGL 359
Sce-DH1-1A VMA LLGLWIGDG 219 VKNIPSFL 307 FLAGLIDSDG 327 TIHTSVRDGLVSLARSLGL 359
Sce-OUT7091 VMA LLGLWIGDG 211 VKNIPSFL 307 FLAGLIDSDG 327 TIHTSVRDGLVSLARSLGL 359
Sce-OUT7112 VMA LLGLWIGDG 219 VKNIPSFL 307 FLAGLIDSDG 327 TIHTSVRDGLVSLARSLGL 359
Sda VMA LLGFWVGNG 210 VKAIPMHL 353 FIAGLVDSIG 373 TAYKSVSEGLIRLARSLGI 408
Sex-IFO1128 VMA LIGFWVGNG 208 VTKIAHEL 344 FVAGLVDAAG 364 TESKSTVLGLVKIARSLGI 399
She RPB2 (RpoB) FFGIWIAEG 147 YRKLPIWV 230 TSSVELADDISRLALHAGW 307
Sja VMA LLGVWVGAG 217 VKSIPSHF 331 FLAGLIDADG 351 TTSPKVRDGTVRIARSLGI 383
Spa VMA LLGLWIGDG 219 VKNIPSFL 307 FLAGLIDSDG 327 TIHTSVRDGLVSLARSLGL 359
Spu PRP8 FLGIWLGNG 221 AKHIPVGY 300 LLAGLIDANG 320
Sun VMA VMGSWVGAG 206 TKTIPKEL 267 FIAGLVDTQG 287 TVYERISNDLVKLARSLGI 322
Tgl VMA LLGLWVGDG 221 VKHVPSYL 309 FLAGLIDSDG 329 TIHKTVMEGTVAVARSLGL 361
Tpr VMA LLGLWAGDV 221 VKNVPTYL 309 FLAGLIDSDG 329 TTLKTVMAGTVAVARSLGL 361
Ure PRP8 None None None None
Vpo VMA LLGLWTGSG 219 NKTVPSFL 392 FLAGLIDSNG 302 LEDNKVMSGIVSLIRSLGL 295
WIV RIR1 None None None None
Zba VMA LLGLWVGDG 221 VKHVPSYL 309 FLAGLIDSDG 329 TIHKTVMEGTVAVARSLGL 361
Zbi VMA FLGLWIGGG 218 SKNVPAYL 306 FLAGLVDSEG 326 TCHHSVMTGVVAVARSLGI 355
Zro VMA FLGLWIGDG 220 SKNVPGYI 309 FLAGIIDSTG 329 TIDQSVMTGTVAVARSLGI 358 |
| Eubacteria |
Intein Block C Block D Block E Block H
AP-APSE1 dpol None None None None
Aae RIR2 LLIVLQADG 149 TKFFDEWV 223 FVEELVKWDG 245 STKEKRNKDFVQALCALGG 278
Aave Hyp-1721 VLGWVVTEG 137 LVDRMIEGDG 238
Aave RIR LLGLLIGDG 125 LLRGLFDTDG 244 QSNPALLQTVQRMLLRLGI 278
Ace RIR1 None None None None
Aeh DnaB-1 LLAYFLGDG 121 EKAVPEVV 278 FLSRLFACDG 298 TSSRALARDVQHLLLRFGI 332
Aeh DnaB-2 LLAHLVGDG 120 EKHLPQEV 201 FLRHLWATDG 221 TASRHLIQDVAALLLRFGI 255
Aeh RIR1 LMGLLLGDG 125 FLRGLYDADG 233 QSDSDNLEAAQRMLLRLGI 267
Aha DnaE-c None None None None
Aha DnaE-n None None None None
Aov DnaE-c None None None None
Aov DnaE-n None None None None
Arsp-FB24 DnaB LLGYWLGDG 124 NKHIPEEY 200 LLQGLLDSDG 220
Asp DnaE-c None None None None
Asp DnaE-n None None None None
Ava DnaE-c None None None None
Ava DnaE-n None None None None
Avin RIR1 BIL
Bce DnaB VLGALLGDG 126 DKFIPRLY 206 VLRGLLDADG 226 TASAQLASDVRELARSLGA 256
BsuP-M1918 RIR1 IMGIIAGDG 131 KTRVPEFI 219 YLSGLFQTDG 239 SIHYESLQDVQKLLLNMGV 274
BsuP-SPBc2 RIR1 IMGIIAGDG 131 KTRVPEFI 219 YLSGLFQTDG 239 SIHYESLQDVQKLLLNMGV 274
Cag RIR1 LIGLLTGDG 114 TDKVFQLA 210 LLRGLFTADG 226 ATSLELLQQVQLLLFNFGI 260
Cau SpoVR
CbP-C-St RNR LIGAYLGDG 129 VLDGIYFTDG 228 TTSDSLKEDLEILIHSLGM 254
CbP-D RNR LIGAYLGDG 129 VLDGIYFTDG 228 TTSDSLKEDLEILINSLGM 254
Cbu DnaB None None None None
Cch RIR1 LIGLLTGDG 114 TDKVFQLA 210 LLRGLFTADG 226 ATSLELLQQVQLLLFNFGI 260
Chy RIR1 FLGYFLGDG 113 EKDIPSSL 188 LLRGLFSADG 208 STSYPLLRKVQILLLSLGI 241
Ckl PTerm ILGVWIGNG 154 NKHIPEIY 264 LLKGLMDTDG 284 QKNKIIIDGFSNLLSSLGI 314
Cth ATPase BIL None None None None
Cth TerA LFGYWIGNG 131 EKRIPIEY 191 LLQGLIDSDG 211 TILFELAKDVQDLLWSLGI 242
Cwa DnaB
Cwa DnaE-c None None None None
Cwa DnaE-n None None None None
Cwa PEP FLGGIITDG 132 FLGGVIDGDG 238 KNRLHIYISEENLLQAVII 260
Cwa RIR1
Dge DnaB LLAYLLAEG 122 NADPELVQ 140 FLRVLLSCDG 226 VASEGLARDVHHALVRFGI 259
Dha-DCB2 RIR1 LLGLLLGDG 124 SKAITQEL 202 LLRGLYDTDG 222 TDLAGLKVVQRMLQRLGII 257
Dha-Y51 RIR1 LLGLLLGDG 124 SKAITQEL 202 LLRGLYDTDG 222 TDLAGLKVVQRMLQRLGII 257
Dra RIR1 LLGSLIGDG 124 HKTLTDKV 202 VLQGLFDADG 222 SDLSLLKRAQRMLSRLGIM 257
Dra Snf2-c VLQGLLDTDG C39 SVSEHLARGVVELVQSLGG C67
Dra Snf2-n YTLGALLGD N129 YKFIPPDY N205
Dra-ATCC13939 Snf2 YTLGALLGD 129 YKFIPPDY 205 VLQGLLDTDG 225 SVSEHLARGVVELVQSLGG 254
Dvul ParB
Fal DnaB VLGSWLGDE 143 DRHIPAGY 270 LLAGLLDADG 290 TTNPRLAREVRELVLSLGC 321
Fsp-CcI3 RIR1 None None None None
Gob DnaE
Gob Hyp
Gvi DnaB None None None None
Gvi RIR1-1
Gvi RIR1-2
Hhal DnaB-1 LLGHLIGDG 120 EKTVPDCV 276 FLNRLFSSDG 295 FSSDGWVTHLASGQGQIGY 309
Hhal DnaB-2 None None None None
Kra DnaB LLGHLIGDG 116 VKRQPLRY 134 FLRHLWATDG 229 STSLQLIQDVSRLLLRFGI 264
LP-phiHSIC Helicase
MP-Aaphi23 MupF
MP-Be DnaB FLGLWLGDG 130 NKHIPAAY 208 LLRGLIDSDG 228 NANRNLVYQFQELVVGLGF 258
MP-Be gp51
MP-Catera gp206 LLGAWLGDG 126 NKHIPDRY 190 LLAGLMDSDG 201 MKNERLMRQVLQLVRSLGY 239
MP-Mcjw1 DnaB ILGAWLGDG 126 AKHVPQDY 197 LLQGLMDTDG 207 NTNRDLAEAALFLARSLGW 240
MP-Omega DnaB LLGYWLGDG 135 NKHIPETY 200 LLAGIMDSDG 220 MKNEALMRQVLMLARSLGY 259
MP-U2 gp50
Mav DnaB TLGAWLGDG 147 NKHIPTEY 196 LLAGLLDTDG 216 VTNQRLARDVNELIVSLGY 246
Mav-104 DnaB TLGAWLGDG 147 NKHIPTEY 196 LLAGLLDTDG 216 VTNQRLARDVNELIVSLGY 246
Mav-PT DnaB TLGAWLGDG 147 NKHIPTEY 276 LLAGLLDTDG 298 VTNQRLARDVNELIVSLGY 326
Mbo Pps1 LAGYYLAEG 144 NKKLSDLL 223 LVDAYVNGDG 243 TTSRLWAFQLQSILARLGH 276
Mbo RecA LLGYLIGDG 123 EKTIPNWF 201 LLFGLFESDG 223 TTSEQLAHQIHWLLLRFGV 257
Mbo SufB (Mbo Pps1) LAGYYLAEG 144 NKKLSDLL 223 LVDAYVNGDG 243 TTSRLWAFQLQSILARLGH 276
Mbo-1173P DnaB SLARMIGDG 123 EKCVPEAV 206 FLRHLWSAGG 226 STSRRLIDDVAQLLLRVGI 261
Mbo-AF2122 DnaB SLARMIGDG 123 EKCVPEAV 206 FLRHLWSAGG 226 STSRRLIDDVAQLLLRVGI 261
Mca MupF
Mca RIR1 HRALPSWDG 108 RLTAPLET 218 LLRGMFDADG 236 QTDLGNLQTVQRMLLRLGV 270
Mch RecA VLGSLMGDG 123 -------- LQRAVYLGDG 193 FLSEEYLKALTPLALAIWY 214
Mex Helicase
Mex TrbC
Mfa RecA ILGSLMGGG 123 -------- LRRAVYLGDG 193 FLSEDYLKALTPLALAVWY 214
Mfl GyrA HLGAFISEG 134 DKAVPEWL 212 FLQALFEGDG 232 TRSGRLAKDIQQMLLEFGV 266
Mfl RecA VLGSLMGDG 123 -------- LQRAVYLGDG 193 FLSEENFKALTPLALVFWY 214
Mfl-ATCC14474 RecA VLGSLMGDG 123 -------- LQRAVYLGDG 193 FLSEEYLKALTPLALAIWY 214
Mfl-PYR-GCK DnaB TLGAWLGDG 147 AKHIPIDY 276 LLAGLLDTDG 296 GTNARLIGDVAELVVSLGY 326
Mga GyrA LLGAFISEG 134 DKYVPEWM 212 FLRALFEGGG 232 TISKQLAMDVQQMLLEFGV 266
Mga RecA VLGSLMGDG 126 -------- LRSAVYLGDG 197 FLSEEYLKGLTPLSLAIWY 218
Mga SufB (Mga Pps1) LLGLYVGDG 137 TKRVPDWV 213 FLGGWVDADG 230 CANQALIGQARELAELAGL 272
Mgi-PYR-GCK DnaB TLGAWLGDG 147 AKHIPIDY 276 LLAGLLDTDG 296 GTNARLIGDVAELVVSLGY 326
Mgi-PYR-GCK GyrA LLGAFISDG 134 AKTVPNWL 212 FLQALFEGDG 232 TRSGQLAKDVQNMLLEFGV 266
Mgo GyrA LLGAFISEG 134 DKSVPEWL 212 FLQALFEGDG 232 TRSRQLAIDVQQMLLEFGV 266
Min DnaB TLGAWLGDG 147 NKHIPTEY 195 LLAGLLDTDG 215 VTNQRLACDVAELIVSLGY 245
Mkas GyrA LLGAFISEG 134 DTYVPEWM 212 FLQALFEGDG 232 TVSKQLAMDVQQMLLEFGV 266
Mle DnaB None None None None
Mle GyrA LFGAFISGG 134 DKLVPDWL 212 FLQALFEGEG 232 TLSERLAADVQQMLLEFGV 266
Mle RecA VLGSLMGDG 123 -------- LQRAVYLGDG 194 FSLEEYLKALTPLVLAIWY 215
Mle SufB (Mle Pps1) LLGLWLGDG 151 TKRLPAWI 225 LIGGLVDADG 245 FASRELLEDVRQLAIGCGL 276
Mma GyrA LLGAFISEG 134 DKSVPDWL 212 FLQALFEGGG 232 TRSRQLAVDVQQMLLEFGI 266
Mmag Magn8951 BIL None None None None
Msh RecA VLGSLMGDG 123 -------- LQRAVYMGDG 193 FLSEEYLKALTPLALAIWY 214
Msm DnaB-1 None None None None
Msm DnaB-2 MLAHMIGDG 123 EKFVPAQV 207 FLRHLWATDG 227 TTSRQLADDVVQLLLRVGV 262
Msp-KMS DnaB TLGAWLGDG 134 DKHIPIEY 181 LLAGLLDTDG 201 VTNKRLAADVAELVVSLGY 231
Msp-KMS GyrA LAGAFISEG 134 DKFVPEWI 212 FLQALFEGDG 232 TRSERLAADVQQMLLEFGI 266
Msp-MCS DnaB TLGAWLGDG 134 DKHIPIEY 181 LLAGLLDTDG 201 VTNKRLAADVAELVVSLGY 231
Msp-MCS GyrA LAGAFISEG 134 DKFVPEWI 212 FLQALFEGDG 232 TRSERLAADVQQMLLEFGI 266
Mthe RecA VLGSLMGDG 123 -------- LRRAVYLGDD 193 FISEEYLKALTPLALAIWY 215
Mtu SufB (Mtu Pps1) LAGYYLAEG 144 NKKLSDLL 223 LVDAYVNGDG 243 TTSRLWAFQLQSILARLGH 276
Mtu-CDC1551 DnaB SLARMIGDG 123 EKCVPEAV 206 FLRHLWSAGG 226 STSRRLIDDVAQLLLRVGI 261
Mtu-F11 DnaB SLARMIGDG 123 EKCVPEAV 206 FLRHLWSAGG 226 STSRRLIDDVAQLLLRVGI 261
Mtu-H37Ra DnaB SLARMIGDG 123 EKCVPEAV 206 FLRHLWSAGG 226 STSRRLIDDVAQLLLRVGI 261
Mtu-H37Rv DnaB SLARMIGDG 123 EKCVPEAV 206 FLRHLWSAGG 226 STSRRLIDDVAQLLLRVGI 261
Mtu-H37Rv RecA LLGYLIGDG 123 EKTIPNWF 201 LLFGLFESDG 223 TTSEQLAHQIHWLLLRFGV 257
Mtu-So93 RecA LLGYLIGDG 123 EKTIPNWF 201 LLFGLFESDG 223 TTSEQLAHQIHWLLLRFGV 257
Mvan DnaB TLGAWLGDG 147 NKHIPTEY 276 LLAGLLDTDG 296 VTSRRLAADVAELVVSLGY 326
Mvan GyrA LLGAFISEG 134 AKMVPEWL 212 FLQALFEGDG 232 TRSGQLAKDVQQMLLEFGV 266
Mxa RAD25
Mxe GyrA None None None None
Nfa DnaB LLAHMIGDG 115 EKFIPRRV 207 FLRHLWATDG 227 STSRRLIDDVAQLLLRLGV 262
Nfa Nfa15250
Nfa RIR1
Npu DnaB LLGHLIGDG 121 EKFVPREL 207 FLRHLWSTDG 227 SSSERLAFDVQTLLLRLGI 263
Npu DnaE-c None None None None
Npu DnaE-n None None None None
Npu GyrB --------- -------- ---------- HNH FAMILY ENDONUCLEASE
Nsp-JS614 DnaB VLGAWLGDG 146 DKHIPADY 211 LLAGLMDTDG 231 VTNKRLADDVYELVVSLGY 262
Nsp-JS614 TOPRIM AHGFTYGDG 122 FKELPDPD 173 WLAGYFAADG 191 SARRENLEFVRMVTTRLGI 221
Nsp-PCC7120 DnaB LLGHLIGDG 121 EKFVPQEL 207 FLRHLWSTDG 227 TSSYRLAVDVQTLLLKLGI 263
Nsp-PCC7120 DnaE-c None None None None
Nsp-PCC7120 DnaE-n None None None None
Nsp-PCC7120 RIR1 LIGFTHGDG 140 QPNIPLTV 223 YLAGLMDSDG 247 SVYRSFIRQVSVVLSSLGI 277
Oli DnaE-c None None None None
Oli DnaE-n None None None None
PP-PhiEL Helicase
PP-PhiEL ORF11
PP-PhiEL ORF39 VLGVLLGDG 139 GKVIPEEY 224 LLQGLLDTDG 244 DKHKSVSFSSSSKLLSLGV 265
PP-PhiEL ORF40
Pfl Fha BIL None None None None
Plut RIR1 LIGLLVGDG 114 LLRGLFTADG 226 STSLDLLLQVQLLLLNFGI 260
Pna RIR1 LIGLITGDG 125 KLEVPEVV 211 YLQGLFQTDG 231 SSHRPLLQDVQVLLANFGV 266
Pnuc DnaB VLGALLGDG 126 DKYIPATY 211 LFQGLMDTDG 231 TASKQLSEDVASLARSLGG 261
Posp-JS666 DnaB VLGGLLGDG 126 VLRGLLDTDG 229 TASHQLAKDVQELVRSLGG 259
Posp-JS666 RIR1 LLGLLIGDG 125 EMGMRPGH 211 LLRGLFDADG 236 QSDLSLLQTAQRMLLRLGI 270
Pssp-A1-1 Fha None None None None
Psy Fha None None None None
Rma DnaB LLGHLIGDG 121 EKKVPALL 206 FLRHLWATDG 226 TSSYQLARDVQSLLLRLGI 263
Rsp RIR1 LLGLLIGDG 124 RKTITPEI 207 VLRGLFDADG 228 QSDLALLQAAQRMLARLGM 262
SaP-SETP12 dpol
SaP-SETP3 Helicase LMGLWLGDG 137 NKHIPHNY 208 LLAGLLDSDG 228 SVSERLADDFCYLCRSLGF 259
SaP-SETP3 dpol
SaP-SETP5 dpol
Sav Helicase YLFGLLLGD 130 GKFIPEDF 208 LLQGLLDTDG 228 SVSLRSASLRLAEDVAWLV 254
Sel-PC6301 RIR1 LLGNFIGDG 125 SFGLKQGL 199 FLRGLFDADG 226 QGSQEKGVSVRLAQSDLGL 247
Sel-PC7942 DnaE-c None None None None
Sel-PC7942 DnaE-n None None None None
Sel-PC7942 RIR1 LLGNFIGDG 125 SFGLKQGL 199 FLRGLFDADG 226 QGSQEKGVSVRLAQSDLGL 247
Sel-PCC6301 DnaE-c None None None None
Sel-PCC6301 DnaE-n None None None None
Sep RIR1
ShP-Sfv-2a-2457T-n Primase VIGSLLGDG 128 NKFIPRVF 199 MLCGLLETDG 219
ShP-Sfv-2a-301-n Primase VIGSLLGDG 128 NKFIPRVF 199 MLCGLLETDG 219
ShP-Sfv-5 Primase VIGSLLGDG 128 NKFIPRVF 199 MLCGLLETDG 219 SASEELRNGVVQLVNSLGG 249
Spl DnaX None None None None
Sru DnaB FLGLWLGDG 129 DKHIPHLY 206 LLAGLIDSDG 226
Sru PolBc
Sru RIR1 ILGMWQSDG 120 KGTVPEWI 211 YVRGLLVADG 231 DVDRDFLQELQLLFNNLGL 267
Ssp DnaB LLGHLIGDG 122 EKFVPNQV 207 FLRHLWSTDG 227 TSSEKLAKDVQSLLLKLGI 263
Ssp DnaE-c None None None None
Ssp DnaE-n None None None None
Ssp DnaX DOD FAMILY ENDONUCLEASE IS IN A DIFFERENT READING FRAME
Ssp GyrB --------- -------- ---------- HNH FAMILY ENDONUCLEASE
Ssp-JA2 DnaB None None None None
Ssp-JA2 RIR1 LLGNLLGDG 125 FLCGLFDADG 225 QSHLGTLKAVQRMLARLGI 259
Ssp-JA3 DnaB None None None None
Ssp-JA3 RIR1 LLGNLLGDG 125 FLCGLFDADG 225 QSNLDTLKAVQRMLARLGI 259
StP-Twort ORF6
Susp-NBC371 DnaB intein FLGLWLGDG 129 NKHIPRSY 207 LLAGLIDSDG 227 LGFRSSLVKKKASIKAIGY 274
Tel DnaE-c None None None None
Tel DnaE-n None None None None
Ter DnaB-1
Ter DnaB-2 None None None None
Ter DnaE-1 None None None None
Ter DnaE-2
Ter DnaE-3c None None None None
Ter DnaE-3n None None None None
Ter GyrB --------- -------- ---------- HNH FAMILY ENDONUCLEASE
Ter Ndse-1
Ter Ndse-2
Ter RIR1-1 FLGYLSGNG 137 YLAGLVDADG 234 SVDQGFLRQVQALYASLGI 262
Ter RIR1-2 MLGWWYRDG 120 FLQGIFSADG 226 MVSEKLLQQIQLILSNLGI 259
Ter RIR1-3 TLGAALGNG 122 FIAGLADTDG 205 VSDYERAKRLQLLLTKCGI 235
Ter RIR1-4 FLGCLFGNG 153 FFCGLIDTNG 251 SASSDFIHNLQQIGESIGL 281
Ter Snf2 FVAWQVAEG 135 EKTIPLFI 220 FLSNYFDAEG 249 TASSQLIQELSILLRRFGV 271
Ter ThyX
Tfus Hyp-2914
Tfus RecA-1 LLGYLTAAG 138 AARIPQCV 211 FLAALYTAAG 231 TASAPLAREVQYLLYGLGI 261
Tfus RecA-2 VYGSLMGRG 123 -------- LHRVVDFGDG 193 HLTWEFLKQLTPLALAVWY 214
Tth-HB27 DnaE-1
Tth-HB27 DnaE-2
Tth-HB27 RIR1-1
Tth-HB27 RIR1-2
Tth-HB8 DnaE-1
Tth-HB8 DnaE-2
Tth-HB8 RIR1-1
Tth-HB8 RIR1-2
Tvu DnaE-c None None None None
Tvu DnaE-n None None None None |
| Archaea |
Intein Block C Block D Block E Block H
Ape APE0745
Fac-Fer1 RIR1 VLGWLVGDG 119 KLNVPDKV 202 FLQALFEADG 222 SISLNLLKQVQMLLLNFGI 256
Fac-Fer1 SufB (Fac Pps1) ILGLYTAEG 140 MKLPPEKQ 229 LIDYYLKGDG 241 TASKILALQLQEMLSRNNT 274
Fac-TypeI RIR1 VLGWLVGDG 119 KLNVPDKV 202 FLQALFEADG 222 SISLNLLKQVQMLLLNFGI 256
Fac-typeI SufB (Fac Pps1) ILGLYTAEG 140 MKLPPEKQ 229 LIDYYLKGDG 241 TASKILALQLQEMLSRNNT 274
Hma CDC21
Hma Pol-II None None None None
Hma PolB
Hma TopA
Hsa-NRC1 CDC21 None None None None
Hsa-NRC1 Pol-II None None None None
Hvo PolB
Hwa GyrB
Hwa MCM-1 None None None None
Hwa MCM-2
Hwa MCM-3 None None None None
Hwa MCM-4
Hwa Pol-II-1 LLGYYAAAG 264
Hwa Pol-II-2 LLGQFIAQR 239 FLQGFILAEN 343 ESETTVTLETPSVGVKDGL 378
Hwa PolB-1 FLAGTLDGSE 272 TESASIARWYAQLYRRLGI 303
Hwa PolB-2
Hwa PolB-3
Hwa RCF
Hwa RIR1-1 LLGLWIDTG 190 IKEAPTNV 271 FLRGCFTAEG 284
Hwa RIR1-2 FLGYFMGSG 152 FLRGVFEAIG 255 TTSTTLADQLQSLLLSLGH 283
Hwa Top6B
Hwa rPol A'' None None None None
Maeo-N3 Helicase FIGYFIGDG 122 FYILPLEK 203 FIAGLFDSDG 216 SISENLIKKLQLALLRFGI 247
Maeo-N3 RtcB LLGFALGDG 157 IKKSPLWV 262 FLAGLFGADG 275 DSLLNYLADIKKMLSEFGI 318
Maeo-N3 UDP GP LIGYYLAEG 225 NKQIPPQI 305 FLKGILRGDG 328 TVSKKLANSLMILLQSLGI 363
Memar MCM2 FLGYFLSEG 153 AKSLPPEQ 231 LLDALMLGDG 244 TSSRRLADDVTELLLKKGL 275
Memar Pol-II None None None None
Mhu Pol-II None None None None
Mja GF-6P IIGYIIGDG 215 NERTPEFV 291 YLRGIFDAEG 311 MTSKCFIKEIQFLLLRFGI 342
Mja Helicase FIGYFIGDG 122 NKNIDAFC 197 LIAGLFDSDG 216 SISEKLVEQLQFVLLRFGI 247
Mja Hyp-1 --------- RSRIPEKI 156 RLVGYFLSEG 173 TTSEILMNQLRLISLRLGF 293
Mja IF2 FAGVMFGDG 215 NIKIPQIL 286 FIKGYFDADG 306 SASKEFIEGLSILLLRFEI 337
Mja KlbA None None None None
Mja PEP LGGAVLSDG 134 SRKIPSEI 196 LIAGFVDGDG 228 SSHIKKIEGLIVGLYRLGI 260
Mja Pol-1 LIGILLAEG 135 -------- ILRGFFEGDG 235 TNNYDKIKFIASLLDRLGI 268
Mja Pol-2 FLGFFVTRG 289 KKHIPEEL 364 ---------- AKDEKYLNQLMILFNLVGI 413
Mja RFC-1 WLGYFIGDG 134 KVRIPKEI 211 FLRAYFDCDG 229 TASKEMAEDLVYALLRFGI 258
Mja RFC-2 MLGLYVAEG 162 NKRIPDII 251 FLKGLADGDS 271 SKSDNLLIDTVWLARISGI 301
Mja RFC-3 LLGFIIGDG 232 GYNIPKWI 321 FLRGLFGADG 341 DKTLEFFEEVKKMLEEFEV 384
Mja RNR-1 FLGLFVAEG 178 NKNSPEFI 247 FLGGLISGDG 267 TTSEQLLGQLHLLLSDLGM 297
Mja RNR-2 LIGAFLSEG 236 NKEIPSIL 314 LIKGYIDGDG 333 TTSETLRDTLCLALKILGI 368
Mja RtcB (Mja Hyp-2) LLGFAFGDG 156 IYKIPEWI 254 FLAGLFGADG 274 ENILEFLNEIKLLLAEFDI 317
Mja TFIIB ILGYIIAEG 147 NKRIPSII 216 FIDGIFNGKD 236 FVSKELAEDVIFLLLQIKE 258
Mja UDP GD LIGYYLSEG 216 NKNIPPQM 299 FLKGLFRGDG 319 TVSKKMAHSLLILLQLLGI 354
Mja r-Gyr FAGLVLGDG 212 IFSLPESY 281 LIAGYFDTDG 294 SKRRDVLEKIGIYLNSIGI 333
Mja rPol A' --------- RKLIPLTY 146 RIVGHVMGDG 165 YSFEVRKKSLCILLKALGC 244
Mja rPol A'' FIGIYLAEG 181 TKKIAEFV 254 LIRGYFDGDG 274 SNSKELIDGIAILLARFNI 305
Mka CDC48 LLGLYVAEG 136 KRLGPLLS 206 ALRGYYTGDG 224 TVSKRLADELLVALQILDI 259
Mka EF2
Mka RFC --------- ELLRELADDG 159 EASEDVSVDLAWLARISGV 180
Mka RtcB ILGFAMGDG 159 PYKVPDWI 252 FLAGLFAADG 272 ENLREFMNDVAKLLREFGI 305
Mka VatB VAGLIASDG 239 VLKMPREL 312 YLAGYVDGDG 325 TADRERAGDLQLLLKRLGV 354
Mth RIR1 None None None None
Neq Pol-c None None None None
Neq Pol-n None None None None
Nph CDC21
Nph PolB-1 DLAGAVDDG 127 LLETLVDGDG 311 TDSGLGEFWTQSDRLKDDV 332
Nph PolB-2 LLAWYVTEG 214 DKRIPQLV 298 FMQTLISGDG 318
Nph rPol A''
Pab CDC21-1 None None None None
Pab CDC21-2
Pab IF2 LAGRKGNID 149 FLRGYFEERS 173 VEARELVEPLSLALLRFGI 200
Pab KlbA None None None None
Pab Lon LMGILFNGG 158 DLKMPWWV 231 PSLFLAFLEG?244 NKNLPFFQELSWYLGLFGI 276
Pab Moaa LIGYFVSDG 227 VFKIPEGA 291 LLSGLFNGDG 320 STSKGLIRDILYLLASLGI 353
Pab Pol-II None None None None
Pab RFC-1 WLGYFFGNG 133 KDSIPEQA 196 FLRAYFDCNA 212 TAGKEIAEQISYALAGLGI 240
Pab RFC-2 VVGFILGDG 301 GYTVPEWI 388 FLRGLFGADG 408 ERTVEFLNDVADLLREFDV 451
Pab RIR1-1 LAGFIAGNG 150 ENGIPPKI 176 FITGLFDAEG 196 MVNKKLIEAVTHYLNSLGI 226
Pab RIR1-2 LLGIIYADG 141 NIRVPEAI 227 FLAGFFDGDG 247 SISREFIKEAQLLFLALGI 277
Pab RIR1-3 VLGWFIGDG 157 EKRIPEIV 230 FLRGLFSADG 250 SKSRELLREVQDLLLLFGI 280
Pab RtcB (Pab Hyp-2) ILGFAFGDG 159 AYRVPGWI 249 FLAGLFAADG 269 ENLVEFLGDVAKLLAEFGI 312
Pab VMA FLGYLIADG 164 KLGVPRNK 227 FIKAYIMCDG 258 TASEEAAYGFSYLLAKLFI 289
Par RIR1 LLGRVVGDG 156 DKLVPEVI 238 FLRGLFDADG 258 STSKRLLREVQQLLLLFGI 288
Pfu CDC21 NO
Pfu IF2 LAGASLDIP 158 FRKHVGFTDS 225 KAKESERYPILEELRRLGL 255
Pfu KlbA LAGVILGDG 202 MWDIPDVV 279 FIAGLFDADG 298 TKSETVARKIWYVLQRLGI 329
Pfu Lon IMGALFGSG 165 KLKLPWWV 243 PSLFLAFMDG 256 DNVETFFEEISWYLSFFGI 293
Pfu RFC WLGYFMGSG 132 LTLIPREG 202 FLRAYSDCNG 218 TDNNDMAQQIAYALASFGI 247
Pfu RIR1-1 LAGFIAGDG 150 DNGIPPQI 231 FIAGLFDAEG 251 MVNKRLIEDVTHYLNALGI 281
Pfu RIR1-2 VLGWFIGDG 157 EKRIPEIV 230 FLRGLFSADG 250 SKSRELLREVQDLLLLFGI 280
Pfu RtcB (Pfu Hyp-2) ILGFALGDG 159 AYRIPVWI 252 FLAGFFGADG 272 ENIKEFLYDISRILEEFGV 314
Pfu TopA LIGYLAGKG 176 -------- FLAGYYDATL 256 GLTLEALYKIKVYLQLLGI 277
Pfu VMA FLGYVIGDG 160 KLGIPGNK 222 FINAYIACDG 253 TASEEGAYGLTYLLAKLGI 285
Pho CDC21-1 None None None None
Pho CDC21-2
Pho IF2 FAGTIFGRE 159 FLRGFFDING 212 GAPHEVLEELSLILLRLGI 243
Pho KlbA LAGVILGDG 202 EWDVPDIV 279 FIAGLFDADG 298 TKSENVARKIWYALQRLGI 327
Pho LHR LLGFWMASG 214 KLEVPPII 283 FLAGYFDGNG 303 AFNRKFAEGIRDILLQLGI 337
Pho Lon ILGALFSDG 215 KLELPWWI 297 FMDGLYSGDG 316 EKKLPFFEEIAWYLSFFGI 362
Pho Pol I LLGYYISSG 289 GKRIPEFI 351 FLKGLNGNAE 371 TKSELLVNQLILLLNSIGV 395
Pho Pol-II None None None None
Pho RFC WLGYFLGGG 132 NAHIPKEC 202 FLRAYFDCNG 218 TASKEMSQEIAYALAGFGI 247
Pho RIR1 VLGWLIGDG 160 DKRVPEIV 233 FLRGLFTADG 253 SKSRELLRDVQDLLLLFGI 283
Pho RadA None None None None
Pho RtcB (Pho Hyp-2) ILGFALANG 159 HDSIPEWI 223 FLAGLFGANG 243 THSRELLNDVSRILEGFKV 277
Pho VMA FLGYLMANG 160 RLGVPEDK 224 LASEEGAYELSYLFAKLGI 278
Pho r-Gyr GKGTLKGDK 168 -------- MIAGYFDASG 241 SKRGDILRMLSVYLYQIGI 269
Psp-GBD Pol LLGYYVSEG 289 NKRVPEVI 364 FLEGYFIGDG 384 TKSELLVNGLVLLLNSLGV 414
Pto VMA LLGLYASYG 141 HDNEQNILKISYMLTGLGI 248
Smar 1471
Smar MCM2
Tac-ATCC25905 VMA None None None None
Tac-DSM1728 VMA None None None None
Tag Pol-1 (Tsp-TY Pol-1) LSGIILAEG 126 LKNIESLY 213 VLRGFFERDA 226 TNNKWKIDIVAKLLDSLGI 259
Tag Pol-2 (Tsp-TY Pol-2) FLGYYVSEG 290 NKRIPSII 365 FLRAYFVGDG 385 TKSELLANQLVFLLNSLGV 415
Tag Pol-3 (Tsp-TY Pol-3) None None None None
Tfu Pol-1 LAGIILAEG 126 VREIMDGI 209 VLRGFFEGDG 224 TNNEWKIEVVSKLLNKLGI 259
Tfu Pol-2 LIGLLVGDG 155 NKAIPSFM 233 FLRGLFSADG 253 SNSLFTETKPNRYLEKESG 305
Thy Pol-1 LLGYYVSEG 289 NKRVPEAI 364 FIEGYFIGDG 384 TKSELLVNGLVLLLNSLGV 414
Thy Pol-2 LIGLLVGDG 155 SKRIPEFM 234 FLRGLFSADG 253 SVNPELSSSVRKLLWLVGV 286
Tko CDC21-1 None None None None
Tko CDC21-2
Tko Helicase
Tko IF2 FAGVMFGDG 220 FLRGYFDADG 302 SASQEFLEDLSLLLLRFGI 342
Tko KlbA VAGVILGDG 202 IWDVPDVV 280 FIAGLFDADG 299 TKSESAARKIWYALQRLGI 330
Tko LHR LLGFWMADG 215 KLGTFPSI 289 FLAGYFDGDG 309 TFNKRFAEGIRDILLQLGI 343
Tko Pol-1 (Pko Pol-1) LAGILLAEG 126 VKEIMDNI 209 VLRGFFEGDG 226 TKNEWKIKLVSKLLSQLGI 259
Tko Pol-2 (Pko Pol-2) LLGYYVSEG 289 NKRIPEFV 364 FLEGYFIGDG 384 NEKRALANQLVLLLNSVGV 413
Tko Pol-II
Tko RFC WLGYFIGDG 132 KVYIPEKG 202 FLRAYFDCDA 225 TASREMAEQVTYALAGFGI 254
Tko RIR1-1 LAGFIAGDG 150 TNGIPQPI 231 FITGLFDAEG 251 SKPGVELGMVNRKLIEDIT 273
Tko RIR1-2 VLGWFIGDG 157 EKRVPEII 230 FLRGLFSADG 250 SKDRGLLRDVQDLLLLFGI 280
Tko RadA
Tko TopA LFGLVAGDG 221 FLAGYYDADG 314 SKNRMAIYTVKQMWQLLGV 353
Tko r-Gyr VFGLVLGDG 208 GKLHPLVF 271 MIAGYFDTDG 290 SKRGDVLRMLSVYLYQIGI 327
Tli Pol-1 LLGYYVSEG 290 NKRIPSVI 365 FLEAYFTGDG 385 TKSELLANQLVFLLNSLGI 415
Tli Pol-2 LVGLIVGDG 156 RRKIPEFM 234 FLRGLFSADG 254 NIDADFLREVRKLLWIVGI 287
Ton-NA1 Pol LLGYYVSEG 287 NKRVPEVI 362 FLEGYFIGDG 382 TKSETLVNGLIILLNSLGI 412
Tpe Pol LVGLLVGDG 156 NKKIPEFM 234 FLRGLFSADG 254 TISDRLASDVRKLLWLVGI 287
Tsp-GE8 Pol-1 LLGYYVSEG 287 NKRVPEVI 362 FFEGYFIGDG 382 TKSEELVNGLVVLLNSLGI 412
Tsp-GE8 Pol-2 LIGLLVGDG 155 NKRIPSFM 233 FLRGLFSADG 254 SVNRELSNEVRKLLWLVGV 286
Tsp-GT Pol-1 LLGYYVSEG 289 NKRVPEAI 364 FIEGYFIGDG 385 TKSELLVNGLVLLLNSLGV 414
Tsp-GT Pol-2 LIGLLVGDG 155 SKRIPEFM 233 FLRGLFSADG 253 SVNPELSSSVRKLLWLVGV 286
Tsp-OGL-20P Pol LLGYYVSEG 289 NKRVPEVV 364 FLGGYFIGDG 384 TKSELLVNGLVLLLNSLGI 414
Tthi Pol LLGYYVSEG 289 NKRVPEVV 364 FLGGYFIGDG 384 TKSELLVNGLVLLLNSLGI 414
Tvo VMA None None None None
Tzi Pol LLSYYVSEG 287 NKRVPEII 362 FLEGCFIGDG 382 TKSEELVNGLVILLNSLGV 412
Unc-ERS PFL LLGYYLSEG 215 NKKLPTEF 297 LIIGLFRGDG 317 ITSKLLRYQISLILLRLGI 352
Unc-ERS RIR1
Unc-ERS RNR WLGLLLSEG 249 NKSVPDFM 319 FIRGYHAGDG 339 TISEGILRFLRYAFLILGV 366
Unc-MetRFS MCM2 None None None None |
CONSENSUS LINE KEY:
| h | hydrophobic residues (G,V,L,I,A,M) |
| a | acidic residues (D,E) |
| r | aromatic residues (F,Y,W) |
| p | polar residues (S,T,C) |
| / | to align block, 1 or more AA not shown |
| - | motif absent |
| . | non-conserved residue |
| * | gap introduced into Block F |
| underlined residues | conserved in almost all inteins |
| capital letters | single letter amino acid code |
|
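As an illustration of how the consensus line key can be applied, the sketch below compares a motif with a consensus string position by position and reports where they differ. It is not part of InBase: the names `GROUPS`, `allowed` and `mismatches` are hypothetical, and only the group letters (h, a, r, p), the dot, and capital letters are handled. The other key symbols (/, -, * and underlining) are outside the scope of this sketch and raise an error.

```python
# Amino acid groups as listed in the consensus line key.
GROUPS = {"h": "GVLIAM", "a": "DE", "r": "FYW", "p": "STC"}


def allowed(symbol):
    """Return the residues permitted by one consensus symbol (None = any)."""
    if symbol == ".":
        return None                   # non-conserved position
    if symbol in GROUPS:
        return GROUPS[symbol]         # lower case letter: residue group
    if symbol.isupper():
        return symbol                 # capital letter: that exact residue
    raise ValueError(f"unsupported consensus symbol: {symbol!r}")


def mismatches(consensus, motif):
    """List the 1-based positions where the motif departs from the consensus."""
    if len(consensus) != len(motif):
        raise ValueError("consensus and motif must have the same length")
    bad = []
    for pos, (symbol, residue) in enumerate(zip(consensus, motif), start=1):
        permitted = allowed(symbol)
        if permitted is not None and residue not in permitted:
            bad.append(pos)
    return bad


# Sce VMA Block C and Block E against the consensus line of this page.
print(mismatches("LhG..hhaG", "LLGLWIGDG"))
print(mismatches(".L.GhFahDG", "FLAGLIDSDG"))
```

Because the consensus records only what a majority of the first 38 inteins share, a motif that differs at a few positions is not necessarily absent or inactive. The list of positions is meant as a prompt for closer inspection, not as a classifier.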