Distribution of restriction sites in the human genome

Enzyme:  BaeI
Specificity:  ACNNNNGTAYC
Number of sites:  260650
Mean distance between sites:  10977 base pairs
Standard deviation:  11478 base pairs
Site density:  91.1 per megabase
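
The summary figures are related: a site density of 91.1 per megabase is simply 1,000,000 divided by the mean spacing of 10977 base pairs. As a minimal sketch only (not the procedure used to produce this report), the Python below shows one way sites matching the degenerate specificity ACNNNNGTAYC could be located and their spacing summarized. Because the site is not palindromic, the sketch also searches the reverse complement. The function names and the toy sequence are hypothetical, and the use of the population standard deviation is an assumption; the report does not say which estimator it uses.

```python
import re
import statistics

# IUPAC codes used here: N = any base, Y = C or T, R = A or G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "Y": "[CT]", "R": "[AG]"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A",
              "N": "N", "Y": "R", "R": "Y"}

def reverse_complement(pattern):
    """Reverse complement of a degenerate recognition sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(pattern))

def to_regex(pattern):
    """Compile an IUPAC pattern; the lookahead reports overlapping matches too."""
    return re.compile("(?=" + "".join(IUPAC[base] for base in pattern) + ")")

def find_sites(seq, pattern="ACNNNNGTAYC"):
    """Sorted 0-based start positions of the site on either strand."""
    seq = seq.upper()
    positions = set()
    for p in (pattern, reverse_complement(pattern)):
        positions.update(m.start() for m in to_regex(p).finditer(seq))
    return sorted(positions)

def spacing_stats(positions):
    """Mean and population standard deviation of distances between sites."""
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return statistics.mean(gaps), statistics.pstdev(gaps)

def density_per_megabase(mean_distance):
    """Sites per megabase implied by a mean spacing in base pairs."""
    return 1e6 / mean_distance

# Toy sequence (hypothetical): one forward site and one reverse-strand site.
toy = "TT" + "ACAAAAGTACC" + "GGGG" + "GATACTTTTGT" + "AA"
sites = find_sites(toy)
```

On a real assembly the search would be run per scaffold, since distances between sites are only meaningful within one contiguous sequence.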


Figure: Distribution of closely spaced sites

Figure: Distribution of sites within 7 STD distance


Longest uncut segments
#   Length (bp)  Chr    Scaffold       Coordinates          Repeat content            Gene content
1   512907       chr15  NT_037852.6    1386842-1899749      2.71 % in 45 repeats      1.30 % in 1 gene
2   411280       chr6   NT_167244.1    2352224-2763504      1.38 % in 26 repeats      0.00 % in 0 genes
3   266875       chr6   NT_167244.1    1992540-2259415      3.99 % in 53 repeats      6.44 % in 4 genes
4   237608       chr12  NT_029419.12   299582-537190        99.75 % in 57 repeats     0.00 % in 0 genes
5   216185       chr6   NT_167247.1    4393365-4609550      9.00 % in 60 repeats      89.84 % in 4 genes
6   210252       chr6   NT_167244.1    4388316-4598568      1.11 % in 8 repeats       0.00 % in 0 genes
7   200507       chr6   NT_167244.1    3771815-3972322      3.59 % in 36 repeats      6.52 % in 1 gene
8   197407       chr6   NT_167244.1    3163991-3361398      5.58 % in 63 repeats      10.86 % in 2 genes
9   196261       chr6   NT_167248.1    515468-711729        10.96 % in 45 repeats     0.00 % in 0 genes
10  189932       chr9   NT_008470.19   21693281-21883213    7.04 % in 58 repeats      0.00 % in 0 genes
11  189626       chr6   NT_167249.1    2135619-2325245      10.02 % in 66 repeats     0.00 % in 0 genes
12  177922       chr6   NT_167247.1    1562046-1739968      5.19 % in 31 repeats      0.00 % in 0 genes
13  175541       chr7   NT_007933.15   68165749-68341290    26.64 % in 121 repeats    0.00 % in 0 genes
14  175034       chr11  NT_009237.18   50500225-50675259    99.80 % in 27 repeats     0.00 % in 0 genes
15  173941       chr4   NT_006316.16   378800-552741        6.29 % in 68 repeats      0.00 % in 0 genes
16  164066       chr2   NT_022184.15   66435939-66600005    28.62 % in 258 repeats    0.00 % in 0 genes
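
Each length in the table equals the difference between its two coordinates (for segment 1, 1899749 - 1386842 = 512907). As a rough sketch under the assumption that the coordinates mark consecutive site positions on one scaffold, the longest uncut segments could be ranked as follows. The function name and the sample positions are hypothetical.

```python
def longest_uncut_segments(positions, top=16):
    """Return (length, start, end) for the widest gaps between consecutive sites."""
    positions = sorted(positions)
    segments = [(end - start, start, end)
                for start, end in zip(positions, positions[1:])]
    # Sort by length, longest first, and keep the requested number of rows.
    return sorted(segments, reverse=True)[:top]

# Hypothetical site positions on a single scaffold.
example = longest_uncut_segments([100, 250, 1000, 1200], top=2)
```

Sequence before the first site and after the last site of a scaffold is not counted by this sketch; whether the report counts those flanks is not stated.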


Repeats in longest uncut segments
#   Length (bp)  Chr    Scaffold       Coordinates          Total  Distinct  Most            Second          Third
1   512907       chr15  NT_037852.6    1386842-1899749      45     29        L1MDa (6)       (TA)n (3)       L2a (3)
2   411280       chr6   NT_167244.1    2352224-2763504      26     17        AluJb (4)       L1ME4a (3)      MLT2D (2)
3   266875       chr6   NT_167244.1    1992540-2259415      53     30        AluSx (7)       L2c (4)         MIRb (3)
4   237608       chr12  NT_029419.12   299582-537190        57     13        ALR/Alpha (32)  L1PA3 (7)       L1PA4 (3)
5   216185       chr6   NT_167247.1    4393365-4609550      60     39        L1PB1 (5)       MIR (3)         L2b (3)
6   210252       chr6   NT_167244.1    4388316-4598568      8      7         MER57-int (2)   (TTCC)n (1)     L1MC (1)
7   200507       chr6   NT_167244.1    3771815-3972322      36     26        AT_rich (4)     MIR (3)         L2a (3)
8   197407       chr6   NT_167244.1    3163991-3361398      63     31        AluSx (9)       L1MC5 (6)       L1MB3 (4)
9   196261       chr6   NT_167248.1    515468-711729        45     31        AT_rich (7)     L2c (3)         L2b (3)
10  189932       chr9   NT_008470.19   21693281-21883213    58     44        L2 (5)          MIR (3)         Tigger1 (2)
11  189626       chr6   NT_167249.1    2135619-2325245      66     31        AluSx (8)       Charlie2b (6)   L2a (4)
12  177922       chr6   NT_167247.1    1562046-1739968      31     20        L1PB2 (4)       L1MEf (3)       MSTB (2)
13  175541       chr7   NT_007933.15   68165749-68341290    121    65        AluSx (7)       L1ME3 (6)       L1M5 (6)
14  175034       chr11  NT_009237.18   50500225-50675259    27     7         ALR/Alpha (13)  L1PA3 (6)       L1PA4 (3)
15  173941       chr4   NT_006316.16   378800-552741        68     15        (CA)n (48)      L1M4 (7)        (TTA)n (1)
16  164066       chr2   NT_022184.15   66435939-66600005    258    63        AluSg/x (30)    (CAAGC)n (25)   (CAGC)n (24)


Genes in longest uncut segments
Segment  Length (bp)  Chr    Scaffold       Coordinates          Gene symbol   Gene function
1        512907       chr15  NT_037852.6    1386842-1899749      LOC100418897
3        266875       chr6   NT_167244.1    1992540-2259415      MDC1          mediator of DNA damage checkpoint protein 1
                                                                 LOC100294090  hypothetical LOC100294090, transcript variant 1
                                                                 FLOT1         flotillin-1
                                                                 DDR1          epithelial discoidin domain-containing receptor 1 isoform DDR1c
5        216185       chr6   NT_167247.1    4393365-4609550      HLA-DPA2
                                                                 COL11A2P
                                                                 LOC100507722  hypothetical protein LOC100507722
                                                                 COL11A2       collagen alpha-2(XI) chain isoform 4 precursor
7        200507       chr6   NT_167244.1    3771815-3972322      HLA-DRB3      major histocompatibility complex, class II, DR beta 3 precursor
8        197407       chr6   NT_167244.1    3163991-3361398      EHMT2         histone-lysine N-methyltransferase, H3 lysine-9 specific 3 isoform b
                                                                 TNXB          tenascin-X isoform 1 precursor



Posfai@neb.com
May 11, 2011