Distribution of restriction sites in the human genome

Enzyme:  BsgI
Specificity:  GTGCAG
Number of sites:  2038395
Mean distance between sites:  1403 base pairs
Standard deviation:  1600 base pairs
Site density:  712.4 per megabase
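
The figures above are linked by simple arithmetic: a density of 712.4 sites per megabase corresponds to a mean spacing of about 1403.7 base pairs (1000000 / 712.4), which agrees with the reported mean of 1403 base pairs up to rounding. The sketch below shows one way such statistics could be computed for a single sequence. It is not the method used to produce this report. It assumes that sites are counted on both strands (GTGCAG is not palindromic, so its reverse complement CTGCAC is searched as well), that distances are measured between consecutive site start positions, and that the standard deviation is the population form. The function names are hypothetical.

```python
import re
from statistics import mean, pstdev

SITE = "GTGCAG"  # BsgI recognition sequence, taken from the report header


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(sequence, site=SITE):
    """Return sorted 0-based start positions of the site on either strand.

    Assumption: both strands count, so the reverse complement is searched too.
    """
    sequence = sequence.upper()
    positions = set()
    for pattern in {site, reverse_complement(site)}:
        # A lookahead reports overlapping occurrences as well.
        for match in re.finditer("(?=" + re.escape(pattern) + ")", sequence):
            positions.add(match.start())
    return sorted(positions)


def spacing_stats(positions, sequence_length):
    """Summarize site spacing; needs at least two sites."""
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return {
        "sites": len(positions),
        "mean_gap": mean(gaps),
        "sd_gap": pstdev(gaps),  # population SD, an assumption
        "density_per_mb": len(positions) / sequence_length * 1e6,
    }


if __name__ == "__main__":
    toy = "AAGTGCAGTTTTCTGCACAAAAGTGCAG"  # made-up sequence, not genome data
    sites = find_sites(toy)
    print(sites, spacing_stats(sites, len(toy)))
```

On a real assembly the same functions would be applied scaffold by scaffold, with the gaps pooled before taking the mean and standard deviation.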


[Figure: Distribution of closely spaced sites]

[Figure: Distribution of sites within 7 STD distance]

Longest uncut segments
#   Length (bp)  Chr    Scaffold      Coordinates        Repeat content         Gene content
1   492845       chr15  NT_037852.6   1397383-1890228    0.64 % in 14 repeats   0.00 % in 0 genes
2   402679       chr6   NT_167244.1   2358895-2761574    0.19 % in 3 repeats    0.00 % in 0 genes
3   208728       chr6   NT_167244.1   4389978-4598706    0.39 % in 5 repeats    0.00 % in 0 genes
4   182462       chr6   NT_167244.1   3788488-3970950    0.60 % in 8 repeats    0.00 % in 0 genes
5   175765       chr6   NT_167244.1   3179971-3355736    0.24 % in 5 repeats    0.14 % in 1 genes
6   173426       chr6   NT_167247.1   4422093-4595519    0.73 % in 3 repeats    100.00 % in 1 genes
7   166639       chr6   NT_167247.1   1561578-1728217    0.78 % in 6 repeats    0.82 % in 1 genes
8   164946       chr6   NT_167249.1   2138415-2303361    0.05 % in 2 repeats    0.00 % in 0 genes
9   164688       chr4   NT_006316.16  389970-554658      4.09 % in 60 repeats   0.00 % in 0 genes
10  161992       chr6   NT_167248.1   521777-683769      1.66 % in 2 repeats    0.00 % in 0 genes
11  158130       chr6   NT_167244.1   2009188-2167318    0.47 % in 4 repeats    0.00 % in 0 genes
12  153608       chr9   NT_008470.19  21692296-21845904  1.02 % in 8 repeats    0.00 % in 0 genes
13  144239       chr6   NT_167244.1   2894510-3038749    0.89 % in 8 repeats    0.00 % in 0 genes
14  119248       chr6   NT_167245.1   2605633-2724881    1.44 % in 4 repeats    0.00 % in 0 genes
15  118220       chr6   NT_167247.1   1175821-1294041    1.56 % in 2 repeats    0.00 % in 0 genes
16  113833       chr6   NT_167246.1   3261028-3374861    0.12 % in 2 repeats    0.00 % in 0 genes
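
Each length in the table equals the difference of its coordinates (for segment 1, 1890228 - 1397383 = 492845), so an uncut segment can be read as the gap between two consecutive sites. A minimal sketch of such a ranking follows, assuming a sorted list of site positions on one scaffold. The function name is hypothetical and this is not necessarily how the report was generated, since the report ranks segments across all scaffolds.

```python
def longest_uncut_segments(positions, top=16):
    """Return (length, start, end) tuples for the largest gaps between
    consecutive site positions, longest first.

    Assumption: `positions` is sorted and comes from a single scaffold.
    """
    gaps = [(b - a, a, b) for a, b in zip(positions, positions[1:])]
    return sorted(gaps, reverse=True)[:top]


if __name__ == "__main__":
    # Made-up positions for illustration only.
    print(longest_uncut_segments([5, 10, 100, 130], top=2))
```

For a genome-wide ranking, the per-scaffold results would be merged and sorted again by length.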


Repeats in longest uncut segments
#   Length (bp)  Chr    Scaffold      Coordinates        Total  Distinct  Most                 Second         Third
1   492845       chr15  NT_037852.6   1397383-1890228    14     11        L2a (3)              L1M5 (2)       U2 (1)
2   402679       chr6   NT_167244.1   2358895-2761574    3      3         L4 (1)               AluSp (1)      AluJb (1)
3   208728       chr6   NT_167244.1   4389978-4598706    5      4         AluSx (2)            L1MC (1)       AluSg/x (1)
4   182462       chr6   NT_167244.1   3788488-3970950    8      7         AT_rich (2)          MLT1H-int (1)  MIR (1)
5   175765       chr6   NT_167244.1   3179971-3355736    5      4         GC_rich (2)          Charlie4a (1)  (CCG)n (1)
6   173426       chr6   NT_167247.1   4422093-4595519    3      3         MER11A (1)           AluSg/x (1)    AluSc (1)
7   166639       chr6   NT_167247.1   1561578-1728217    6      5         MIR (2)              L1MC3 (1)      (GGAA)n (1)
8   164946       chr6   NT_167249.1   2138415-2303361    2      2         L1MB8 (1)            AluSx (1)
9   164688       chr4   NT_006316.16  389970-554658      60     8         (CA)n (47)           L1M4 (7)       MER5B (1)
10  161992       chr6   NT_167248.1   521777-683769      2      2         L1PREC2 (1)          HERVH-int (1)
11  158130       chr6   NT_167244.1   2009188-2167318    4      4         MIRb (1)             MER5A1 (1)     L1MC4a (1)
12  153608       chr9   NT_008470.19  21692296-21845904  8      6         LTR67B (2)           L2 (2)         MIRb (1)
13  144239       chr6   NT_167244.1   2894510-3038749    8      6         L1MC5 (2)            AluJo (2)      AluY (1)
14  119248       chr6   NT_167245.1   2605633-2724881    4      3         L2 (2)               MLT1E2 (1)     L2a (1)
15  118220       chr6   NT_167247.1   1175821-1294041    2      1         ERV3-16A3_I-int (2)
16  113833       chr6   NT_167246.1   3261028-3374861    2      2         MIRb (1)             AluSx (1)


Genes in longest uncut segments
Segment  Length (bp)  Chr   Scaffold     Coordinates      Gene symbol   Gene function
5        175765       chr6  NT_167244.1  3179971-3355736  EHMT2         histone-lysine N-methyltransferase, H3 lysine-9 specific 3 isoform b
6        173426       chr6  NT_167247.1  4422093-4595519  LOC100507722  hypothetical protein LOC100507722
7        166639       chr6  NT_167247.1  1561578-1728217  LOC100421582  tripartite motif-containing protein 26



Posfai@neb.com
May 11, 2011