Distribution of restriction sites in the human genome

Enzyme:  MscI
Specificity:  TGGCCA
Number of sites:  1296892
Mean distance between sites:  2206 base pairs
Standard deviation:  2513 base pairs
Site density:  453.2 per megabase
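
The summary statistics above can be reproduced in principle by scanning a sequence for the recognition site and measuring the gaps between consecutive hits. The following is a minimal sketch, not the method used to generate this page. The function names, the toy sequence, the use of site start positions as coordinates, and the choice of population standard deviation are all assumptions made for illustration. Because TGGCCA is its own reverse complement, scanning one strand is enough.

```python
import re
from statistics import mean, pstdev

SITE = "TGGCCA"  # MscI recognition sequence, as listed above


def site_positions(seq, site=SITE):
    """Return 0-based start positions of every occurrence of site in seq."""
    # A lookahead is used so that overlapping matches would also be reported.
    pattern = re.compile("(?=" + re.escape(site) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]


def site_stats(seq, site=SITE):
    """Summarise site count, gap mean/SD and density for one sequence."""
    positions = site_positions(seq, site)
    # Distance between consecutive sites, measured start to start (assumption).
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return {
        "n_sites": len(positions),
        "mean_gap": mean(gaps) if gaps else None,
        # Population SD is an assumption; the page does not say which form it uses.
        "sd_gap": pstdev(gaps) if gaps else None,
        "density_per_mb": len(positions) / len(seq) * 1_000_000 if seq else 0.0,
    }


# Hypothetical toy sequence with three sites at known offsets.
toy = "AA" + SITE + "TTTT" + SITE + "G" * 8 + SITE
print(site_stats(toy))
```

As a consistency check on the reported figures, density and mean spacing are roughly reciprocal: 1,000,000 / 453.2 is about 2206.5 base pairs, which agrees with the stated mean distance of 2206.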


[Figure: Distribution of closely spaced sites]

[Figure: Distribution of sites within 7 STD distance]


Longest uncut segments
#  Length (bp)  Chr  Scaffold  Coordinates  Repeat content  Gene content
1   490107  chr15  NT_037852.6  1397710-1887817    0.16 % in   6 repeats    0.00 % in 0 genes
2   404820  chr6  NT_167244.1  2357901-2762721    0.45 % in   9 repeats    0.00 % in 0 genes
3   246260  chr6  NT_167244.1  2008701-2254961    0.95 % in   12 repeats    2.06 % in 2 genes
4   214151  chr6  NT_167244.1  4386536-4600687    2.42 % in   18 repeats    0.00 % in 0 genes
5   190341  chr6  NT_167244.1  3781997-3972338    2.51 % in   20 repeats    2.89 % in 1 genes
6   179744  chr6  NT_167244.1  3177021-3356765    1.10 % in   20 repeats    2.11 % in 2 genes
7   175231  chr6  NT_167247.1  4420635-4595866    1.02 % in   8 repeats    100.00 % in 1 genes
8   172157  chr6  NT_167247.1  1559921-1732078    2.14 % in   17 repeats    1.76 % in 1 genes
9   167458  chr6  NT_167249.1  2136066-2303524    1.08 % in   9 repeats    0.00 % in 0 genes
10   164028  chrY  NT_011875.12  8555055-8719083    69.38 % in   20 repeats    0.00 % in 0 genes
11   163483  chr6  NT_167248.1  521836-685319    2.55 % in   2 repeats    0.00 % in 0 genes
12   154823  chr7  NT_023603.5  39589-194412    100.00 % in   2 repeats    0.00 % in 0 genes
13   151166  chr9  NT_008470.19  21692803-21843969    0.35 % in   3 repeats    0.00 % in 0 genes
14   143435  chr6  NT_167244.1  2894047-3037482    0.21 % in   2 repeats    0.00 % in 0 genes
15   125472  chr1  NT_077389.3  264175-389647    99.30 % in   57 repeats    0.00 % in 0 genes
16   125092  chr14  NT_026437.12  196010-321102    99.29 % in   10 repeats    0.00 % in 0 genes
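
In the table above, each Length equals the difference between the two Coordinates values (for example, 1887817 - 1397710 = 490107). A ranking of this kind can be sketched as picking the largest gaps between consecutive site positions on a scaffold. This is an illustrative sketch only: the function name, the sample positions, and the decision to ignore the stretches before the first site and after the last site are assumptions, not a description of how this page was produced.

```python
import heapq


def longest_uncut_segments(positions, k=3):
    """Return the k largest gaps between consecutive site positions.

    Each result is a tuple (length, start, end) with length == end - start.
    """
    ordered = sorted(positions)
    gaps = [(b - a, a, b) for a, b in zip(ordered, ordered[1:])]
    # nlargest compares tuples, so the gap length is the primary sort key.
    return heapq.nlargest(k, gaps)


# Hypothetical site positions on a single scaffold.
sites = [100, 350, 400, 1900, 2000, 2750]
for length, start, end in longest_uncut_segments(sites, k=2):
    print(length, f"{start}-{end}")
```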


Repeats in longest uncut segments
#  Length (bp)  Chr  Scaffold  Coordinates  Total repeats  Distinct repeats  Most frequent  Second  Third
1  490107  chr15  NT_037852.6  1397710-1887817  6  6  MLT1L (1)  MIRc (1)  MIRb (1)
2  404820  chr6  NT_167244.1  2357901-2762721  9  7  L4 (2)  AluJb (2)  L1MEg (1)
3  246260  chr6  NT_167244.1  2008701-2254961  12  10  MIRb (2)  MIR (2)  MER5A1 (1)
4  214151  chr6  NT_167244.1  4386536-4600687  18  13  MER57-int (3)  AluSx (3)  AluSg/x (2)
5  190341  chr6  NT_167244.1  3781997-3972338  20  14  L2a (3)  MLT1H-int (2)  AT_rich (2)
6  179744  chr6  NT_167244.1  3177021-3356765  20  13  AluSx (4)  GC_rich (3)  MER44B (2)
7  175231  chr6  NT_167247.1  4420635-4595866  8  8  (TTAAA)n (1)  MIR (1)  MER11A (1)
8  172157  chr6  NT_167247.1  1559921-1732078  17  14  Tigger7 (2)  MIR (2)  L1MEe (2)
9  167458  chr6  NT_167249.1  2136066-2303524  9  8  AluSx (2)  MLT1A (1)  MamGypLTR1b (1)
10  164028  chrY  NT_011875.12  8555055-8719083  20  9  LTR12B (9)  L1PA16 (4)  (TATAA)n (1)
11  163483  chr6  NT_167248.1  521836-685319  2  2  L1PREC2 (1)  HERVH-int (1)
12  154823  chr7  NT_023603.5  39589-194412  2  2  L1PA2 (1)  ALR/Alpha (1)
13  151166  chr9  NT_008470.19  21692803-21843969  3  3  MIR3 (1)  LTR67B (1)  L1M5 (1)
14  143435  chr6  NT_167244.1  2894047-3037482  2  2  AluSg1 (1)  AluSc (1)
15  125472  chr1  NT_077389.3  264175-389647  57  5  ALR/Alpha (52)  MLT1J (2)  L2 (1)
16  125092  chr14  NT_026437.12  196010-321102  10  7  CER (4)  MER94 (1)  L1PA4 (1)


Genes in longest uncut segments
Segment  Length (bp)  Chr  Scaffold  Coordinates  Gene symbol  Gene function
3  246260  chr6  NT_167244.1  2008701-2254961  FLOT1  flotillin-1
3  246260  chr6  NT_167244.1  2008701-2254961  DDR1  epithelial_discoidin_domain-containing_receptor_1_isoform_DDR1c
5  190341  chr6  NT_167244.1  3781997-3972338  HLA-DRB3  major_histocompatibility_complex,_class_II,_DR_beta_3_precursor
6  179744  chr6  NT_167244.1  3177021-3356765  EHMT2  histone-lysine_N-methyltransferase,_H3_lysine-9_specific_3_isoform_b
6  179744  chr6  NT_167244.1  3177021-3356765  TNXB  tenascin-X_isoform_1_precursor
7  175231  chr6  NT_167247.1  4420635-4595866  LOC100507722  hypothetical_protein_LOC100507722
8  172157  chr6  NT_167247.1  1559921-1732078  LOC100421582  tripartite_motif-containing_protein_26



Posfai@neb.com
May 11, 2011