Distribution of restriction sites in the human genome

Enzyme:  BssSI
Specificity:  CACGAG
Number of sites:  387590
Mean distance between sites:  7382 base pairs
Standard deviation:  8789 base pairs
Site density:  135.5 per megabase
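
The summary figures can be cross-checked against each other. The sketch below is illustrative only: the variable names are hypothetical, and it assumes the site density is simply the reciprocal of the mean distance, scaled to one megabase. The page itself does not state how the density was computed.

```python
# Values copied from the summary above.
num_sites = 387_590
mean_distance_bp = 7382
reported_density_per_mb = 135.5

# Assumption: density per megabase = 1,000,000 bp / mean spacing in bp.
density_per_mb = 1_000_000 / mean_distance_bp

# Rough sequence length implied by the site count and the mean spacing.
approx_span_bp = num_sites * mean_distance_bp

print(f"Site density: {density_per_mb:.1f} per megabase")
print(f"Implied sequence length: {approx_span_bp / 1e9:.2f} Gbp")
```

Under that assumption the computed density rounds to the reported 135.5 per megabase, and the implied sequence length is about 2.86 Gbp.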


Distribution of closely spaced sites

Distribution of sites within 7 STD distance


Longest uncut segments
#   Length (bp)  Chr    Scaffold      Coordinates        Repeat content          Gene content
1   496233       chr15  NT_037852.6   1395271-1891504    0.88 % in 20 repeats    0.00 % in 0 genes
2   408682       chr6   NT_167244.1   2355717-2764399    0.90 % in 18 repeats    0.00 % in 0 genes
3   248639       chr6   NT_167244.1   2008429-2257068    1.76 % in 19 repeats    2.15 % in 2 genes
4   217514       chr6   NT_167244.1   4383069-4600583    2.46 % in 19 repeats    0.27 % in 1 gene
5   208607       chr6   NT_167244.1   3787585-3996192    7.43 % in 58 repeats    0.00 % in 0 genes
6   187291       chr6   NT_167247.1   4408415-4595706    2.63 % in 19 repeats    97.08 % in 2 genes
7   181683       chr6   NT_167244.1   3179906-3361589    1.91 % in 17 repeats    3.15 % in 2 genes
8   177838       chr6   NT_167248.1   514981-692819      7.47 % in 14 repeats    0.64 % in 1 gene
9   177149       chrY   NT_011875.12  8532871-8710020    71.69 % in 10 repeats   0.00 % in 0 genes
10  176061       chr6   NT_167249.1   2135279-2311340    4.06 % in 35 repeats    0.00 % in 0 genes
11  175514       chr6   NT_167247.1   1556537-1732051    3.36 % in 30 repeats    0.00 % in 0 genes
12  169405       chr9   NT_008470.19  21678725-21848130  5.76 % in 36 repeats    0.00 % in 0 genes
13  164151       chr4   NT_006316.16  396194-560345      5.29 % in 57 repeats    0.00 % in 0 genes
14  150722       chr7   NT_077528.2   21043-171765       66.68 % in 26 repeats   0.00 % in 0 genes
15  146501       chr6   NT_167244.1   2891008-3037509    1.82 % in 11 repeats    0.00 % in 0 genes
16  137657       chr4   NT_016354.19  61595898-61733555  51.14 % in 196 repeats  0.00 % in 0 genes
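
For downstream use, the rows of the table above can be read with plain whitespace splitting. This is a minimal sketch, not the tool that produced the page: `Segment` and `parse_segment_row` are hypothetical names, and only the first five columns are parsed. It also checks an observation from the data, namely that each reported length equals end minus start of the coordinate range.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    rank: int
    length: int
    chrom: str
    scaffold: str
    start: int
    end: int


def parse_segment_row(row: str) -> Segment:
    """Parse the first five whitespace-separated columns of one table row."""
    fields = row.split()
    start, end = (int(x) for x in fields[4].split("-"))
    return Segment(int(fields[0]), int(fields[1]), fields[2], fields[3], start, end)


# Three rows copied from the table (ranks 1, 9 and 16).
rows = [
    "1   496233  chr15  NT_037852.6  1395271-1891504    0.88 % in   20 repeats    0.00 % in 0 genes",
    "9   177149  chrY  NT_011875.12  8532871-8710020    71.69 % in   10 repeats    0.00 % in 0 genes",
    "16   137657  chr4  NT_016354.19  61595898-61733555    51.14 % in   196 repeats    0.00 % in 0 genes",
]

segments = [parse_segment_row(r) for r in rows]

for seg in segments:
    # In this table the length column equals end - start (no +1).
    assert seg.end - seg.start == seg.length
    print(seg.rank, seg.chrom, seg.scaffold, seg.length)
```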


Repeats in longest uncut segments
                                                         Repeats
#   Length (bp)  Chr    Scaffold      Coordinates        Total  Distinct  Most            Second       Third
1   496233       chr15  NT_037852.6   1395271-1891504    20     17        L2a (3)         L1M5 (2)     U2 (1)
2   408682       chr6   NT_167244.1   2355717-2764399    18     12        AluJb (3)       MLT2D (2)    L4 (2)
3   248639       chr6   NT_167244.1   2008429-2257068    19     16        MIRb (2)        MIR (2)      AluSx (2)
4   217514       chr6   NT_167244.1   4383069-4600583    19     13        MER57-int (3)   AluSx (3)    AluY (2)
5   208607       chr6   NT_167244.1   3787585-3996192    58     41        L2a (6)         L1P3 (3)     L1MEc (3)
6   187291       chr6   NT_167247.1   4408415-4595706    19     15        L2b (3)         (TGGA)n (2)  MIRb (2)
7   181683       chr6   NT_167244.1   3179906-3361589    17     9         AluSx (4)       MIRb (2)     MIR (2)
8   177838       chr6   NT_167248.1   514981-692819      14     11        AT_rich (4)     MLT1G3 (1)   LTR7 (1)
9   177149       chrY   NT_011875.12  8532871-8710020    10     1         LTR12B (10)
10  176061       chr6   NT_167249.1   2135279-2311340    35     18        Charlie2b (6)   AluSx (5)    L1MB8 (3)
11  175514       chr6   NT_167247.1   1556537-1732051    30     24        L2c (3)         Tigger7 (2)  MSTD (2)
12  169405       chr9   NT_008470.19  21678725-21848130  36     24        MIRb (3)        L1M5 (3)     AluSq (3)
13  164151       chr4   NT_006316.16  396194-560345      57     7         (CA)n (44)      L1M4 (7)     L1PA10 (2)
14  150722       chr7   NT_077528.2   21043-171765       26     7         ALR/Alpha (16)  L1PA4 (3)    L1P1 (2)
15  146501       chr6   NT_167244.1   2891008-3037509    11     8         AluY (4)        (TG)n (1)    (TCC)n (1)
16  137657       chr4   NT_016354.19  61595898-61733555  196    90        AT_rich (29)    (TA)n (8)    AluY (8)


Genes in longest uncut segments
Segment  Length (bp)  Chr    Scaffold      Coordinates        Gene symbol   Gene function
3        248639       chr6   NT_167244.1   2008429-2257068    FLOT1         flotillin-1
                                                              DDR1          epithelial discoidin domain-containing receptor 1 isoform DDR1c
4        217514       chr6   NT_167244.1   4383069-4600583    HLA-DPB2      major histocompatibility complex, class II, DP beta 2 (pseudogene)
6        187291       chr6   NT_167247.1   4408415-4595706    COL11A2P
                                                              LOC100507722  hypothetical protein LOC100507722
7        181683       chr6   NT_167244.1   3179906-3361589    EHMT2         histone-lysine N-methyltransferase, H3 lysine-9 specific 3 isoform b
                                                              TNXB          tenascin-X isoform 1 precursor
8        177838       chr6   NT_167248.1   514981-692819      OR12D1P



Posfai@neb.com
May 11, 2011