Distribution of restriction sites in the human genome

Enzyme:  NheI
Specificity:  GCTAGC
Number of sites:  281049
Mean distance between sites:  10180 base pairs
Standard deviation:  11291 base pairs
Site density:  98.2 per megabase
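
The summary figures above are related by simple arithmetic: the site density per megabase is the reciprocal of the mean spacing, scaled to one million base pairs. The following Python sketch shows one way such numbers could be derived from a sequence. It is not the method used to produce this page: the function names are hypothetical, the toy sequence is made up, and using the population standard deviation over consecutive site-to-site distances is an assumption.

```python
import re
from statistics import mean, pstdev

SITE = "GCTAGC"  # NheI recognition sequence; palindromic, so one strand suffices


def site_positions(seq, site=SITE):
    """Return 0-based start positions of every occurrence of `site` in `seq`."""
    # A zero-width lookahead also reports overlapping occurrences (e.g. GCTAGCTAGC).
    return [m.start() for m in re.finditer(f"(?={site})", seq.upper())]


def spacing_stats(positions):
    """Mean and population standard deviation of distances between consecutive sites."""
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return mean(gaps), pstdev(gaps)


def density_per_megabase(mean_distance_bp):
    """Sites per 1,000,000 bp implied by a given mean spacing."""
    return 1_000_000 / mean_distance_bp


# Toy sequence (hypothetical), with sites at positions 2, 12 and 26.
toy = "AAGCTAGCTTTTGCTAGCAAAAAAAAGCTAGC"
print(site_positions(toy))

# Consistency check against the summary above: 10180 bp mean spacing.
print(round(density_per_megabase(10180), 1))  # 98.2
```

With the reported mean distance of 10180 base pairs, the computed density agrees with the 98.2 sites per megabase listed above.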


[Figure: Distribution of closely spaced sites]

[Figure: Distribution of sites within a distance of 7 standard deviations]


Longest uncut segments
#  Length (bp)  Chr  Scaffold  Coordinates  Repeat content  Gene content
1   520777  chr15  NT_037852.6  1385377-1906154    3.38 % in   58 repeats    1.92 % in 1 gene
2   441596  chr6  NT_167244.1  2345219-2786815    5.04 % in   91 repeats    1.42 % in 1 gene
3   257057  chr6  NT_167244.1  2008967-2266024    3.21 % in   39 repeats    2.99 % in 3 genes
4   250512  chr6  NT_167244.1  4352103-4602615    9.95 % in   55 repeats    7.92 % in 2 genes
5   223543  chr6  NT_167244.1  3140303-3363846    9.18 % in   110 repeats    20.28 % in 4 genes
6   221249  chr6  NT_167244.1  3766627-3987876    8.59 % in   72 repeats    5.90 % in 1 gene
7   213870  chr6  NT_167246.1  3168607-3382477    9.75 % in   100 repeats    15.76 % in 4 genes
8   204922  chr6  NT_167247.1  4402157-4607079    7.11 % in   45 repeats    92.68 % in 3 genes
9   204753  chr7  NT_023603.5  32821-237574    99.97 % in   11 repeats    0.00 % in 0 genes
10   192489  chr6  NT_167249.1  4665743-4858232    29.73 % in   271 repeats    0.00 % in 0 genes
11   181716  chr6  NT_167249.1  2133546-2315262    6.02 % in   51 repeats    0.00 % in 0 genes
12   178863  chr6  NT_167244.1  2885727-3064590    8.31 % in   79 repeats    0.00 % in 0 genes
13   173459  chr6  NT_167247.1  1559865-1733324    2.70 % in   20 repeats    0.00 % in 0 genes
14   166237  chr6  NT_167248.1  517732-683969    4.17 % in   2 repeats    0.00 % in 0 genes
15   161033  chr12  NT_029419.12  38067-199100    81.79 % in   219 repeats    0.00 % in 0 genes
16   155913  chr19  NT_011109.16  28260532-28416445    52.27 % in   427 repeats    0.00 % in 0 genes
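
Each row above can be checked for internal consistency: the segment length should equal the difference between the end and start coordinates. The sketch below parses one row and performs that check. The regular expression and the `parse_row` helper are hypothetical, written for the row layout shown here rather than taken from the tool that generated the table.

```python
import re

# Pattern for one row of the "Longest uncut segments" table (layout assumed from the rows above).
ROW = re.compile(
    r"^\s*(?P<rank>\d+)\s+(?P<length>\d+)\s+(?P<chrom>chr\w+)\s+(?P<scaffold>\S+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)\s+"
    r"(?P<repeat_pct>[\d.]+)\s+%\s+in\s+(?P<repeats>\d+)\s+repeats\s+"
    r"(?P<gene_pct>[\d.]+)\s+%\s+in\s+(?P<genes>\d+)\s+genes?\s*$"
)

INT_FIELDS = ("rank", "length", "start", "end", "repeats", "genes")
FLOAT_FIELDS = ("repeat_pct", "gene_pct")


def parse_row(line):
    """Parse one table row into a dict with numeric fields converted."""
    match = ROW.match(line)
    if match is None:
        raise ValueError(f"unrecognized row: {line!r}")
    row = match.groupdict()
    for key in INT_FIELDS:
        row[key] = int(row[key])
    for key in FLOAT_FIELDS:
        row[key] = float(row[key])
    return row


line = "1   520777  chr15  NT_037852.6  1385377-1906154    3.38 % in   58 repeats    1.92 % in 1 genes"
row = parse_row(line)

# Length should match the coordinate span.
print(row["end"] - row["start"] == row["length"])  # True

# How many mean inter-site distances (10180 bp) the longest segment spans.
print(round(row["length"] / 10180, 1))
```

For the first row, 1906154 - 1385377 = 520777, which matches the reported length, so the coordinates and lengths in this table appear to use the same convention.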


Repeats in longest uncut segments
#  Length (bp)  Chr  Scaffold  Coordinates  Total repeats  Distinct repeats  Most frequent  Second  Third
1  520777  chr15  NT_037852.6  1385377-1906154    58  36       L1MDa (6)  AT_rich (5)  Tigger2 (3)
2  441596  chr6  NT_167244.1  2345219-2786815    91  49       AluY (7)  AluJb (5)  AT_rich (4)
3  257057  chr6  NT_167244.1  2008967-2266024    39  25       MIR (5)  MIRb (3)  L1MEe (3)
4  250512  chr6  NT_167244.1  4352103-4602615    55  34       HUERS-P3-int (7)  L1PB1 (4)  MER57-int (3)
5  223543  chr6  NT_167244.1  3140303-3363846    110  41       AluSx (14)  MIR (8)  L1MC5 (8)
6  221249  chr6  NT_167244.1  3766627-3987876    72  47       L2a (9)  L1M5 (5)  AT_rich (4)
7  213870  chr6  NT_167246.1  3168607-3382477    100  45       AluSx (18)  AT_rich (5)  AluSg (5)
8  204922  chr6  NT_167247.1  4402157-4607079    45  30       L1PB1 (4)  L2b (3)  AluSx (3)
9  204753  chr7  NT_023603.5  32821-237574    11  4       ALR/Alpha (5)  L1PA2 (4)  L1PA3 (1)
10  192489  chr6  NT_167249.1  4665743-4858232    271  69       AluSx (28)  AluJb (19)  AluY (17) 
11  181716  chr6  NT_167249.1  2133546-2315262    51  28       Charlie2b (6)  AluSx (6)  L1MB8 (3) 
12  178863  chr6  NT_167244.1  2885727-3064590    79  36       AluY (7)  AluJb (7)  L1MC5 (6) 
13  173459  chr6  NT_167247.1  1559865-1733324    20  15       Tigger7 (2)  MSTB (2)  MIR (2) 
14  166237  chr6  NT_167248.1  517732-683969    2       L1PREC2 (1)  HERVH-int (1) 
15  161033  chr12  NT_029419.12  38067-199100    219  80       AluSx (33)  SST1 (10)  AluJb (10) 
16  155913  chr19  NT_011109.16  28260532-28416445    427  100       AluSx (40)  AluJo (34)  MIR (29) 


Genes in longest uncut segments
Segment  Length (bp)  Chr  Scaffold  Coordinates  Gene symbol  Gene function
1   520777       chr15  NT_037852.6  1385377-1906154    LOC100418897
2   441596       chr6  NT_167244.1  2345219-2786815    MICB  MHC_class_I_polypeptide-related_sequence_B_precursor
3   257057       chr6  NT_167244.1  2008967-2266024    FLOT1  flotillin-1
                                                       DDR1  epithelial_discoidin_domain-containing_receptor_1_isoform_DDR1c
                                                       MUC21  mucin-21_precursor
4   250512       chr6  NT_167244.1  4352103-4602615    COL11A2P
                                                       HLA-DPB2  major_histocompatibility_complex,_class_II,_DP_beta_2_(pseudogene)
5   223543       chr6  NT_167244.1  3140303-3363846    NEU1  sialidase-1_precursor
                                                       SLC44A4  choline_transporter-like_protein_4_isoform_3
                                                       EHMT2  histone-lysine_N-methyltransferase,_H3_lysine-9_specific_3_isoform_b
                                                       TNXB  tenascin-X_isoform_1_precursor
6   221249       chr6  NT_167244.1  3766627-3987876    HLA-DRB3  major_histocompatibility_complex,_class_II,_DR_beta_3_precursor
7   213870       chr6  NT_167246.1  3168607-3382477    NEU1  sialidase-1_precursor
                                                       C2  complement_C2_isoform_3
                                                       CFB  complement_factor_B_preproprotein_preproprotein
                                                       TNXB  tenascin-X_isoform_1_precursor
8   204922       chr6  NT_167247.1  4402157-4607079    COL11A2P
                                                       LOC100507722  hypothetical_protein_LOC100507722
                                                       COL11A2  collagen_alpha-2(XI)_chain_isoform_4_precursor



Posfai@neb.com
May 11, 2011