Distribution of restriction sites in the human genome

Enzyme: BcgI
Specificity: CGANNNNNNTGC
Number of sites: 320289
Mean distance between sites: 8933 base pairs
Standard deviation: 11966 base pairs
Site density: 111.9 per megabase
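
The summary figures above can be reproduced in outline on any sequence. The following is a minimal sketch, not the method used to generate this page: the function names are hypothetical, and searching both strands, counting overlapping sites, and using the population standard deviation are all assumptions, since the page does not state how sites were counted.

```python
import re
from statistics import mean, pstdev

# Recognition sequence from the summary above; N stands for any base.
SITE = "CGANNNNNNTGC"


def site_regex(site):
    """Compile a site into a regex; the lookahead lets overlapping sites match."""
    body = "".join("[ACGT]" if base == "N" else base for base in site)
    return re.compile(f"(?=({body}))")


def revcomp(seq):
    """Reverse complement of a DNA string (N is kept as N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def find_sites(seq, site=SITE):
    """Sorted 0-based start positions of the site on either strand (assumption)."""
    seq = seq.upper()
    positions = {m.start() for m in site_regex(site).finditer(seq)}
    # The site is not palindromic, so the reverse complement is searched too.
    positions |= {m.start() for m in site_regex(revcomp(site)).finditer(seq)}
    return sorted(positions)


def spacing_stats(positions):
    """Mean and population standard deviation of distances between adjacent sites."""
    if len(positions) < 2:
        raise ValueError("need at least two sites to measure a distance")
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return mean(gaps), pstdev(gaps)


def density_per_megabase(mean_distance_bp):
    """Sites per megabase implied by a mean inter-site distance in base pairs."""
    return 1_000_000 / mean_distance_bp
```

As a consistency check on the reported values, `density_per_megabase(8933)` rounds to 111.9, which agrees with the site density listed above.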


[Figure: Distribution of closely spaced sites]

[Figure: Distribution of sites within 7 STD distance]


Longest uncut segments
# Length  Chr  Scaffold  Coordinates  Repeat content Gene content
1   557889  chr15  NT_037852.6  1397326-1955215    5.40 % in   123 repeats    3.26 % in 2 genes
2   423271  chr6  NT_167244.1  2351962-2775233    2.70 % in   52 repeats    0.00 % in 0 genes
3   250740  chr6  NT_167244.1  2004845-2255585    2.06 % in   24 repeats    3.02 % in 3 genes
4   244816  chr6  NT_167244.1  3748790-3993606    13.62 % in   105 repeats    5.34 % in 1 gene
5   236144  chr6  NT_167244.1  4382189-4618333    8.11 % in   54 repeats    0.62 % in 1 gene
6   233303  chr8  NT_167187.1  31386743-31620046    99.26 % in   76 repeats    0.00 % in 0 genes
7   219778  chr6  NT_167248.1  496395-716173    16.03 % in   73 repeats    2.41 % in 4 genes
8   215426  chr4  NT_016354.19  25681604-25897030    51.17 % in   313 repeats    15.23 % in 1 gene
9   212746  chr5  NT_034772.6  11729661-11942407    50.41 % in   318 repeats    0.00 % in 0 genes
10   209654  chr6  NT_167247.1  4386355-4596009    7.99 % in   43 repeats    0.00 % in 0 genes
11   189348  chrX  NT_011651.17  31353935-31543283    82.37 % in   262 repeats    0.00 % in 0 genes
12   187116  chr4  NT_016297.16  1945985-2133101    53.52 % in   323 repeats    0.00 % in 0 genes
13   186017  chr13  NT_024524.14  56087375-56273392    45.65 % in   320 repeats    0.00 % in 0 genes
14   183566  chr4  NT_016354.19  17105114-17288680    67.23 % in   285 repeats    0.00 % in 0 genes
15   183384  chr6  NT_007299.13  13397617-13581001    51.43 % in   292 repeats    0.00 % in 0 genes
16   181323  chr6  NT_167249.1  2126579-2307902    5.23 % in   43 repeats    0.00 % in 0 genes
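
Each row of the table above follows a fixed layout, so the rows can be parsed and checked mechanically. The sketch below assumes that layout (rank, length, chromosome, scaffold, start-end coordinates, repeat percentage and count, gene percentage and count); `parse_segment_row` is a hypothetical helper, not part of any published tool. It also verifies that the reported length equals the end coordinate minus the start coordinate, which holds for the rows listed here.

```python
import re

# One row of the "Longest uncut segments" table, as laid out above.
ROW = re.compile(
    r"^\s*(?P<rank>\d+)\s+(?P<length>\d+)\s+(?P<chrom>chr\w+)\s+(?P<scaffold>\S+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)\s+"
    r"(?P<repeat_pct>[\d.]+)\s*%\s+in\s+(?P<repeat_count>\d+)\s+repeats?\s+"
    r"(?P<gene_pct>[\d.]+)\s*%\s+in\s+(?P<gene_count>\d+)\s+genes?\s*$"
)

INT_FIELDS = ("rank", "length", "start", "end", "repeat_count", "gene_count")
FLOAT_FIELDS = ("repeat_pct", "gene_pct")


def parse_segment_row(line):
    """Parse one table row into a dict; raise ValueError if the layout differs."""
    match = ROW.match(line)
    if match is None:
        raise ValueError(f"unrecognized row: {line!r}")
    row = match.groupdict()
    for key in INT_FIELDS:
        row[key] = int(row[key])
    for key in FLOAT_FIELDS:
        row[key] = float(row[key])
    # Consistency check: the reported length should span the coordinates.
    if row["end"] - row["start"] != row["length"]:
        raise ValueError(f"length does not match coordinates: {line!r}")
    return row
```

For example, passing the first row of the table returns a dict with `chrom` set to `"chr15"` and `length` set to 557889.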


Repeats in longest uncut segments
#  Length  Chr  Scaffold  Coordinates  Total repeats  Distinct repeats  Most frequent  Second  Third
1  557889  chr15  NT_037852.6  1397326-1955215    123  60       AT_rich (12)  AluJb (7)  L2a (6)
2  423271  chr6  NT_167244.1  2351962-2775233    52  31       AluY (4)  AluJb (4)  L4 (3)
3  250740  chr6  NT_167244.1  2004845-2255585    24  17       AluSx (4)  MIRb (2)  MIR (2)
4  244816  chr6  NT_167244.1  3748790-3993606    105  58       L2a (11)  AT_rich (7)  L1M5 (5)
5  236144  chr6  NT_167244.1  4382189-4618333    54  30       AluSx (7)  Harlequin-int (5)  AluY (5)
6  233303  chr8  NT_167187.1  31386743-31620046    76  22       ALR/Alpha (39)  LTR14C (5)  AluY (5)
7  219778  chr6  NT_167248.1  496395-716173    73  54       AT_rich (8)  L2c (3)  L2b (3)
8  215426  chr4  NT_016354.19  25681604-25897030    313  132       AT_rich (26)  L2a (19)  MIRb (15)
9  212746  chr5  NT_034772.6  11729661-11942407    318  118       AT_rich (44)  AluSx (14)  L1MEg (13)
10  209654  chr6  NT_167247.1  4386355-4596009    43  29       L1PB1 (5)  MIR (3)  L2b (3) 
11  189348  chrX  NT_011651.17  31353935-31543283    262  118       L1MA3 (13)  L1PB1 (9)  AT_rich (9) 
12  187116  chr4  NT_016297.16  1945985-2133101    323  125       AT_rich (44)  L2a (18)  (TA)n (13) 
13  186017  chr13  NT_024524.14  56087375-56273392    320  141       AT_rich (26)  MIRb (13)  L2c (13) 
14  183566  chr4  NT_016354.19  17105114-17288680    285  126       AT_rich (23)  MIR (8)  L1M5 (8) 
15  183384  chr6  NT_007299.13  13397617-13581001    292  122       AT_rich (35)  L2a (16)  MIRb (15) 
16  181323  chr6  NT_167249.1  2126579-2307902    43  21       Charlie2b (6)  AluSx (6)  MamGypLTR1b (3) 


Genes in longest uncut segments
Segment  Length (bp)  Chr  Scaffold  Coordinates  Gene symbol  Gene function
1   557889       chr15  NT_037852.6  1397326-1955215    LOC100418897
                                                        LOC646214  p21_protein_(Cdc42/Rac)-activated_kinase_2_pseudogene
3   250740       chr6  NT_167244.1  2004845-2255585    LOC100294090  hypothetical_LOC100294090,_transcript_variant_1
                                                        FLOT1  flotillin-1
                                                        DDR1  epithelial_discoidin_domain-containing_receptor_1_isoform_DDR1c
4   244816       chr6  NT_167244.1  3748790-3993606    HLA-DRB3  major_histocompatibility_complex,_class_II,_DR_beta_3_precursor
5   236144       chr6  NT_167244.1  4382189-4618333    HLA-DPB2  major_histocompatibility_complex,_class_II,_DP_beta_2_(pseudogene)
7   219778       chr6  NT_167248.1  496395-716173    OR2G1P
                                                        OR12D1P
                                                        OR11A1  olfactory_receptor_11A1
                                                        OR10C1  olfactory_receptor_10C1
8   215426       chr4  NT_016354.19  25681604-25897030    EMCN  endomucin_isoform_2



Posfai@neb.com
May 11, 2011