Distribution of restriction sites in the human genome

Enzyme:  HaeIV
Specificity:  GAYNNNNNRTC
Number of sites:  1363636
Mean distance between sites:  2098 base pairs
Standard deviation:  2175 base pairs
Site density:  476.6 per megabase
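
The specificity above is written with IUPAC degenerate codes (Y = C or T, R = A or G, N = any base), and the sequence is its own reverse complement, so scanning one strand is enough. As a minimal sketch, not the tool that produced this report, the pattern can be turned into a regular expression and matched against a sequence. The function names and the toy sequence are hypothetical; the last line checks that the reported site density is the reciprocal of the reported mean distance.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(site):
    """Translate a degenerate recognition sequence into a regex string."""
    return "".join(IUPAC[base] for base in site.upper())

def find_sites(sequence, site):
    """Return 0-based start positions of every match, overlapping ones included."""
    # Wrapping the pattern in a lookahead lets overlapping sites be reported.
    pattern = re.compile("(?=" + iupac_to_regex(site) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]

def density_per_megabase(mean_distance_bp):
    """Sites per megabase implied by a mean inter-site distance in base pairs."""
    return 1_000_000 / mean_distance_bp

# Hypothetical toy sequence containing two HaeIV-like sites.
toy = "TTGACAAAAAGTCTTGATCCCCCATCAA"
print(find_sites(toy, "GAYNNNNNRTC"))        # [2, 15]
print(round(density_per_megabase(2098), 1))  # 476.6
```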


[Figure: Distribution of closely spaced sites]

[Figure: Distribution of sites within 7 STD distance]


Longest uncut segments
#   Length (bp)  Chr  Scaffold  Coordinates  Repeat content  Gene content
1   493181  chr15  NT_037852.6  1393577-1886758    0.53 % in   11 repeats    0.00 % in 0 genes
2   408465  chr6  NT_167244.1  2354695-2763160    0.94 % in   18 repeats    0.00 % in 0 genes
3   211812  chr6  NT_167244.1  4389846-4601658    1.35 % in   11 repeats    0.00 % in 0 genes
4   188880  chr6  NT_167244.1  3790048-3978928    2.79 % in   23 repeats    0.00 % in 0 genes
5   184625  chr6  NT_167247.1  4414134-4598759    3.01 % in   21 repeats    98.39 % in 1 genes
6   184522  chr6  NT_167244.1  3179730-3364252    2.18 % in   20 repeats    4.64 % in 2 genes
7   172795  chr6  NT_167247.1  1556910-1729705    2.71 % in   24 repeats    3.50 % in 1 genes
8   171784  chr6  NT_167249.1  2135765-2307549    3.13 % in   25 repeats    0.00 % in 0 genes
9   162339  chr6  NT_167248.1  520336-682675    1.87 % in   2 repeats    0.00 % in 0 genes
10   154809  chr9  NT_008470.19  21688861-21843670    2.33 % in   9 repeats    0.00 % in 0 genes
11   144440  chr6  NT_167244.1  2893873-3038313    0.84 % in   7 repeats    0.00 % in 0 genes
12   126796  chr1  NT_077389.3  262939-389735    98.71 % in   58 repeats    0.00 % in 0 genes
13   119654  chr1  NT_004350.19  2058104-2177758    3.78 % in   9 repeats    0.00 % in 0 genes
14   119622  chr6  NT_167245.1  2604767-2724389    1.66 % in   5 repeats    0.00 % in 0 genes
15   119478  chr6  NT_167247.1  1173244-1292722    3.59 % in   7 repeats    0.00 % in 0 genes
16   116508  chr6  NT_167246.1  3259727-3376235    0.88 % in   7 repeats    0.00 % in 0 genes
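
In the rows above, the Length column equals the difference between the end and start coordinates. A minimal parsing sketch, assuming the whitespace-separated layout shown here, reads one row into a record and checks that relationship. The `Segment` type and `parse_segment` function are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple

class Segment(NamedTuple):
    rank: int
    length: int
    chrom: str
    scaffold: str
    start: int
    end: int

# Rank, length, chromosome, scaffold accession, then "start-end" coordinates.
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(chr\w+)\s+(\S+)\s+(\d+)-(\d+)")

def parse_segment(line):
    """Parse the leading columns of one row of the uncut-segments table."""
    m = ROW.match(line)
    if m is None:
        raise ValueError(f"unrecognized row: {line!r}")
    rank, length, chrom, scaffold, start, end = m.groups()
    return Segment(int(rank), int(length), chrom, scaffold, int(start), int(end))

rows = [
    "1   493181  chr15  NT_037852.6  1393577-1886758    0.53 % in   11 repeats    0.00 % in 0 genes",
    "10   154809  chr9  NT_008470.19  21688861-21843670    2.33 % in   9 repeats    0.00 % in 0 genes",
]
for seg in map(parse_segment, rows):
    # Reported length should match end minus start.
    print(seg.rank, seg.chrom, seg.end - seg.start == seg.length)
```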


Repeats in longest uncut segments
#   Length (bp)  Chr  Scaffold  Coordinates  Total repeats  Distinct repeats  Most  Second  Third
1   493181  chr15  NT_037852.6  1393577-1886758    11  9       L1MDa (3)  MIRc (1)  MIRb (1)
2   408465  chr6  NT_167244.1  2354695-2763160    18  11       L1ME4a (3)  AluJb (3)  MLT2D (2)
3   211812  chr6  NT_167244.1  4389846-4601658    11  8       HERVH-int (2)  AluSx (2)  AluSg/x (2)
4   188880  chr6  NT_167244.1  3790048-3978928    23  18       L2a (4)  MLT1H-int (2)  L1M5 (2)
5   184625  chr6  NT_167247.1  4414134-4598759    21  17       AluSx (3)  MLT1J (2)  L2b (2)
6   184522  chr6  NT_167244.1  3179730-3364252    20  10       AluSx (4)  MIR (3)  GC_rich (3)
7   172795  chr6  NT_167247.1  1556910-1729705    24  18       L2c (3)  Tigger7 (2)  MSTD (2)
8   171784  chr6  NT_167249.1  2135765-2307549    25  12       Charlie2b (6)  AluSx (5)  L1MB8 (3)
9   162339  chr6  NT_167248.1  520336-682675    2  2       L1PREC2 (1)  HERVH-int (1)
10  154809  chr9  NT_008470.19  21688861-21843670    9  7       LTR67B (2)  L1M4b (2)  MSTA (1)
11  144440  chr6  NT_167244.1  2893873-3038313    7  6       AluJo (2)  L1MC5 (1)  AluY (1)
12  126796  chr1  NT_077389.3  262939-389735    58  6       ALR/Alpha (52)  MLT1J (2)  L2a (1)
13  119654  chr1  NT_004350.19  2058104-2177758    9  5       L1MB3 (4)  AluSg (2)  MER8 (1)
14  119622  chr6  NT_167245.1  2604767-2724389    5  5       MLT1E2 (1)  MER5B (1)  MER5A1 (1)
15  119478  chr6  NT_167247.1  1173244-1292722    7  4       L2 (3)  ERV3-16A3_I-int (2)  MLT1E2 (1)
16  116508  chr6  NT_167246.1  3259727-3376235    7  5       MIRb (2)  AluSx (2)  MIR3 (1)
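
Each of the last three columns pairs a repeat family name with its copy count in parentheses. As a small sketch under that assumption, the entries can be split into (name, count) pairs, for example to see how much of a segment's repeat total the top families cover. `parse_repeat_entries` is a hypothetical helper, not part of the original report.

```python
import re

# A family name (no whitespace) followed by its count in parentheses.
ENTRY = re.compile(r"(\S+)\s+\((\d+)\)")

def parse_repeat_entries(text):
    """Split 'Name (n)  Name (n) ...' into a list of (name, count) pairs."""
    return [(name, int(count)) for name, count in ENTRY.findall(text)]

# Top three families of segment 12, which holds 58 repeats in total.
entries = parse_repeat_entries("ALR/Alpha (52)  MLT1J (2)  L2a (1)")
print(entries)                       # [('ALR/Alpha', 52), ('MLT1J', 2), ('L2a', 1)]
print(sum(c for _, c in entries))    # 55
```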


Genes in longest uncut segments
Segment  Length (bp)  Chr  Scaffold  Coordinates  Gene symbol  Gene function
5   184625       chr6  NT_167247.1  4414134-4598759    LOC100507722  hypothetical_protein_LOC100507722
6   184522       chr6  NT_167244.1  3179730-3364252    EHMT2  histone-lysine_N-methyltransferase,_H3_lysine-9_specific_3_isoform_b
6   184522       chr6  NT_167244.1  3179730-3364252    TNXB  tenascin-X_isoform_1_precursor
7   172795       chr6  NT_167247.1  1556910-1729705    LOC100421582  tripartite_motif-containing_protein_26



Posfai@neb.com
May 11, 2011