"Building promoter aware transcriptional regulatory networks using siRNA perturbation and deepCAGE".
Morana Vitezic 1, 2,*, Timo Lassmann 1,*, Alistair R. R. Forrest 1, Masanori Suzuki 1, Yasuhiro Tomaru 1, Jun Kawai 1, Piero Carninci 1, Harukazu Suzuki 1, Yoshihide Hayashizaki 1 and Carsten O. Daub 1
1 Omics Science Center, RIKEN Yokohama Institute, 1-7-22
Suehiro-cho, Tsurumi-ku, Yokohama 230-0045 Japan
2 Department of Cell and Molecular Biology (CMB),
Karolinska Institute, SE-171 77, Stockholm, Sweden
*To whom correspondence should be addressed. Tel: +81 45 503 9220; Fax: +81 45 503 9216; Email: mvitezic@gsc.riken.jp
Correspondence may also be addressed to Timo Lassmann. Tel: +81 45 503 9220; Fax: +81 45 503 9216; Email: lassmann@gsc.riken.jp
Received February 1, 2010. Revision received August
2, 2010. Accepted August 2, 2010.
Perturbation and time-course data sets, in combination with computational
approaches, can be used to infer transcriptional regulatory networks which
ultimately govern the developmental pathways and responses of cells. Here,
we individually knocked down the four transcription factors PU.1, IRF8,
MYB and SP1 in the human monocyte leukemia THP-1 cell line and profiled
the genome-wide transcriptional response of individual transcription starting
sites using deep sequencing based Cap Analysis of Gene Expression. From
the proximal promoter regions of the responding transcription starting
sites, we derived de novo binding-site motifs, characterized their
biological function and constructed a network. We found a previously described
composite motif for PU.1 and IRF8 that explains the overlapping set of
transcriptional responses upon knockdown of either factor.
INTRODUCTION:
The human genome project (1) and the subsequent annotation efforts (2,3) provided us with a catalog of the genes present in our genome. These efforts quickly gave rise to systems approaches aimed at understanding the interactions between genes that ultimately govern phenotype and disease pathology (4). The complex interactions among transcription factors derived from such networks point to diverse regulatory programs responsible for cell differentiation during development and cellular responses to outside stimuli.
A powerful technique to understand gene regulatory networks is the perturbation of individual transcription factors in concert with high-throughput expression profiling of all genes (5). Commonly, microarrays are used to measure the changes in gene expression (6-8). In addition to defining regulatory interactions, transcription factor binding site (TFBS) motifs can be extracted from promoter regions of affected genes. Searching the genome sequence in silico with such motifs can reveal putative downstream targets of the transcription factors. However, these predictions are fraught with difficulties summarized by the futility theorem (9). In brief, most predicted binding sites will have no functional role in general and, despite binding in vitro, may not be functional in the cellular model studied or may only be functional in the presence of additional factors (co-regulation). Therefore, it is desirable to couple computational approaches with experimental techniques to identify actively used TFBS.
Chromatin immunoprecipitation (ChIP) in conjunction with tiling microarrays or sequencing can identify the possible binding sites of transcription factors. Such experiments, however, require an antibody specific to each transcription factor; these antibodies are difficult to produce and, for many transcription factors, not yet available (10). Each experiment also requires additional factor-specific optimization.
Here, we describe the use of deep sequencing based Cap Analysis of Gene Expression (deepCAGE) (11) to study the effects of transcription factor (TF) perturbations on target gene expression at the promoter level. Previously, deepCAGE was used to accurately define and compare the transcriptional start sites (TSS) of genes in various tissues (7) and during cell differentiation (3), as well as to determine the distance of the TATA-box from the TSS (12). Restricting TFBS analysis to the accurately mapped TSSs discards many false-positive predictions in intergenic regions and thus improves the accuracy of transcriptional regulatory networks (3). In contrast to previous approaches, this allows for the construction of transcriptional regulatory gene networks at the resolution of individual promoters.
In this study, we combined our deepCAGE (3,13) technology with knockdown (KD) perturbation experiments of four key transcription factors (PU.1, IRF8, MYB and SP1) expressed in the human monoblastic leukemia cell line THP-1 (14). Previously, we demonstrated by using siRNA-mediated gene knockdown and microarray profiling that these four factors regulate large numbers of genes important to monocyte biology. In particular, MYB knockdown promotes monocytic differentiation of THP-1 cells, indicating a central role in maintaining the undifferentiated monoblast state (3).
DeepCAGE profiles were generated for each of the samples and compared
to cells treated with a scrambled negative control oligo. This approach
allowed us to identify the most strongly affected TSSs for each TF knockdown
and their corresponding promoter regions. We then attempted to derive de
novo TFBS motifs from the promoter regions and compared our results
to the known binding-site models in the TRANSFAC database. Finally, these
data were used to draw a basic regulatory network based on the direct
regulatory interactions we identified.
MATERIALS AND METHODS:
Cell culture and knockdown experiments
We used RNA extracted from the same knockdown human leukemia THP-1 cell batches used in the recent FANTOM4 project (3, 8). In brief, transfection was performed using stealth siRNA (Invitrogen) and RNA was harvested after 48 hr. TF gene-expression levels in THP-1 cells treated with gene-specific siRNAs (SP1, PU.1, IRF8 and MYB) or the calibrator negative control (NC) siRNA were estimated by qRT-PCR in triplicate [see Supplementary material of Suzuki et al. (3)].
deepCAGE library generation, mapping and clustering of deepCAGE tags
deepCAGE libraries were prepared for the five knockdown experiments according to the deepCAGE protocol (3, 13) and sequenced using the Roche 454 sequencer. In total, 6187981 deepCAGE tags were mapped to the human reference genome sequence (hg18) using Nexalign (Lassmann,T., http://genome.gsc.riken.jp/osc/english/dataresource/) allowing up to one mismatch or one indel. Tags with TSS falling into windows of 20-bp were grouped into 396118 tag clusters (TCs). For all further analyses, we focused on a filtered set of 3332 robustly detected TCs with a minimum average deepCAGE expression across the five (four KD and control) libraries of 30 tags per million (TPM).
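The clustering and filtering steps described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say whether the 20-bp windows are fixed or sliding, so fixed, non-overlapping bins per chromosome and strand are assumed here, and all function names are hypothetical.

```python
from collections import defaultdict

WINDOW = 20          # window size in bp, as stated in the text
MIN_MEAN_TPM = 30.0  # minimum average expression across libraries


def cluster_tags(tags, window=WINDOW):
    """Group mapped tags into clusters keyed by (chrom, strand, bin index).

    `tags` is an iterable of (library, chrom, strand, tss_position).
    Assumption: fixed, non-overlapping bins. Returns {cluster: {library: count}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for library, chrom, strand, pos in tags:
        counts[(chrom, strand, pos // window)][library] += 1
    return counts


def to_tpm(counts, library_sizes):
    """Convert raw counts to tags per million using each library's tag total."""
    return {
        cluster: {lib: per_lib.get(lib, 0) * 1e6 / size
                  for lib, size in library_sizes.items()}
        for cluster, per_lib in counts.items()
    }


def robust_clusters(tpm, min_mean=MIN_MEAN_TPM):
    """Keep clusters whose mean TPM across all libraries reaches the threshold."""
    return {cluster: values for cluster, values in tpm.items()
            if sum(values.values()) / len(values) >= min_mean}
```

Libraries with no tags in a cluster contribute zero to the mean, so a cluster detected in only one library must be strongly expressed there to pass the filter.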
Comparison of deepCAGE and microarray expression
For comparing the perturbation of deepCAGE expression profiles with microarray expression, we first mapped the 3332 robustly detected TCs to Entrez gene models, requiring that the tags originated within the boundaries of known transcripts for the locus or up to 1 kb upstream. The 3332 TCs mapped to 3114 Entrez genes using this approach, with 84 genes possessing more than one robustly detected TC. Fold change for the deepCAGE data was then calculated by dividing the gene expression in TF KD by the expression in the negative control experiment. Microarray probe mapping to Entrez gene and expression fold changes were obtained as described in Suzuki et al. (3). This then allowed direct comparison of fold changes measured by deepCAGE with the corresponding measurement by microarray.
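The gene-level fold-change calculation can be illustrated with a short sketch. It assumes that the expression of a gene with several robustly detected TCs is the sum of its TC values, which the text does not state, and the function names and the optional pseudocount are hypothetical.

```python
import math


def gene_expression(tc_tpm, tc_to_gene):
    """Sum TC-level TPM per gene (assumption: TCs of one gene are summed)."""
    totals = {}
    for tc, tpm in tc_tpm.items():
        gene = tc_to_gene.get(tc)
        if gene is not None:  # skip TCs that map to no gene model
            totals[gene] = totals.get(gene, 0.0) + tpm
    return totals


def log2_fold_change(kd, nc, pseudocount=0.0):
    """Return log2(KD / NC) for every gene measured in both conditions.

    Genes with a non-positive numerator or denominator are skipped; the
    pseudocount is a hypothetical safeguard, not part of the described method.
    """
    result = {}
    for gene in kd.keys() & nc.keys():
        numerator = kd[gene] + pseudocount
        denominator = nc[gene] + pseudocount
        if numerator > 0 and denominator > 0:
            result[gene] = math.log2(numerator / denominator)
    return result
```

Working in log2 space puts up- and downregulation on a symmetric scale, which matches the log2 fold changes compared between platforms in Figure 1.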
De novo motif prediction, TFBS prediction and ChIP-chip data
Proximal promoter regions of TSSs were defined as previously described (3) and include 300 bp upstream and 100 bp downstream of the deepCAGE-defined TSS. We extracted the corresponding active deepCAGE promoter regions from the human genome (hg18) and applied the motif-finding program MEME (15). We applied MEME to regions which are at least 1.5-fold up- or downregulated in both microarray and deepCAGE measurement. The selection was further restricted to the top 50 of such regions based on recommendations found in Bailey et al. (15). We hypothesize that this selection enriches for promoters that are direct targets of the transcription factor. In the case of IRF8, SP1 and PU.1, fewer than 50 TCs were upregulated by at least 1.5-fold (20, 22 and 38, respectively); therefore, smaller training sets were used for these classes.
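The promoter definition and the training-set selection can be sketched as below. Several details are assumptions: coordinates are treated as inclusive genomic positions with the TSS as reference point, candidates are ranked by their deepCAGE fold change, and all names are hypothetical.

```python
def promoter_window(tss, strand, upstream=300, downstream=100):
    """Return (start, end) of the proximal promoter around a TSS.

    Upstream and downstream are interpreted relative to the strand.
    """
    if strand == "+":
        return tss - upstream, tss + downstream
    if strand == "-":
        return tss - downstream, tss + upstream
    raise ValueError("strand must be '+' or '-'")


def select_training_set(fc_cage, fc_array, direction, min_fold=1.5, top_n=50):
    """Pick TCs changed at least `min_fold` in the same direction on both platforms.

    `fc_cage` and `fc_array` map a TC id to its linear fold change (KD / control).
    Assumption: the strongest responders are ranked by deepCAGE fold change.
    """
    if direction == "down":
        passes = lambda fc: fc <= 1.0 / min_fold
    elif direction == "up":
        passes = lambda fc: fc >= min_fold
    else:
        raise ValueError("direction must be 'up' or 'down'")
    shared = [tc for tc in fc_cage
              if tc in fc_array and passes(fc_cage[tc]) and passes(fc_array[tc])]
    shared.sort(key=lambda tc: fc_cage[tc], reverse=(direction == "up"))
    return shared[:top_n]
```

When fewer than `top_n` TCs pass both thresholds, the function simply returns the smaller set, mirroring the reduced training sets used for the upregulated classes.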
MEME can report multiple motifs for each set of the proximal promoter regions. In such cases, we only selected the motif with the most significant E-value for further analysis. We did not attempt to merge similar motifs.
To assess whether the obtained motifs are biologically relevant, we searched the remaining TCs (3332 TCs, excluding the training sets) using the program Fimo from the Meta-MEME package (16). For comparison, we used the TRANSFAC database and the accompanying Match program (17,18) to scan our sequences for the presence of TRANSFAC defined motifs. Furthermore, we overlaid our TCs with previously published ChIP-chip data (3) for PU.1 and SP1 (detailed Methods available in the Supplementary Data).
We used the UCSC Genome Browser Vertebrate Multiz Alignment & PhastCons track to assess the conservation of our motifs. A base position in the motif was deemed to be conserved if the conservation was at least 80%.
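The conservation criterion can be expressed as a small sketch. It assumes that the 80% cutoff corresponds to a per-base conservation score of at least 0.8 and that the scores for each motif occurrence have already been extracted; the function name and input layout are hypothetical.

```python
def conservation_summary(site_scores, threshold=0.8):
    """Summarize conservation over a set of motif occurrences.

    `site_scores` holds one list of per-base conservation scores per occurrence.
    Returns (fraction of conserved base positions, number of occurrences in
    which every base position is conserved).
    """
    total = conserved = fully_conserved = 0
    for scores in site_scores:
        flags = [score >= threshold for score in scores]
        total += len(flags)
        conserved += sum(flags)
        if flags and all(flags):
            fully_conserved += 1
    fraction = conserved / total if total else 0.0
    return fraction, fully_conserved
```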
Accession codes
DNA Data Bank of Japan (DDBJ) Read Archive: DRX000341 (CAGE library
I05).
RESULTS:
deepCAGE and microarray profiling of siRNA knockdowns identifies overlapping sets of perturbed genes
To evaluate deepCAGE as a platform for
measuring gene-expression perturbation, we used the same batches of RNAs
for both TF suppression and negative control samples as were used in the
microarray analysis for the FANTOM4 main paper (3). For
these samples, the efficient knockdown was already confirmed by qRT-PCR
and western blotting. We observed an overall positive correlation for all
four TF knockdown samples across both platforms (Figure 1).
In general, deepCAGE fold changes were greater than those measured by microarrays,
as has been previously noted (19).
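As an illustration of the platform comparison, the Pearson correlation between paired log2 fold changes can be computed as in the sketch below. The function name is hypothetical, and the inputs are assumed to be two equally long lists holding the deepCAGE and microarray values for the same genes in the same order.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two paired samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

A coefficient near 1 indicates that both platforms rank the responses similarly even if, as noted above, the magnitude of the fold changes differs between them.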
Figure 1. DeepCAGE and microarrays detect overall similar expression
changes.
The transcriptome-profiling technologies deepCAGE and microarrays showed an overall similar transcriptional response (log2 expression fold-change) when comparing samples before and after siRNA-based knockdown of the transcription factors IRF8, MYB, PU.1 and SP1. The Pearson correlation values for these two platforms are:
(a) 0.389 (P = 1.3e-12) for IRF8,
(b) 0.453 (P = 2.2e-16) for MYB,
(c) 0.450 (P = 1.2e-11) for PU.1 and
(d) 0.404 (P = 6.7e-10) for SP1.
Enrichment in the upregulated set of promoters suggests the
TF works as a repressor, whereas enrichment in the downregulated
set of promoters suggests the TF works as an activator. As an example,
we find that knockdown of IRF8, a known activator (20),
results in downregulation in both the deepCAGE and microarray experiments
of XAF1, a gene which we predict to contain our novel motif (Figure
2). The observation that MYB knockdown yielded motifs for both
up- and downregulated sets is consistent with its known role as both
a transcriptional activator and repressor (21). Despite
this, the motifs found in either set appear to be different, which may
suggest different modes or different co-factors for binding repressive
and activating sites.
Figure 2. DeepCAGE identified individual transcription starting
sites responding to transcription factor knockdown.
DeepCAGE profiling of the transcriptome quantitatively measures individual transcription starting sites (TSS) of capped mRNA indicated by the vertical bars (a) before and (b) after the knockdown of the IRF8 transcription factor. Red bars indicate CAGE tags that do not change upon knockdown while the black bars represent tags showing significant change upon knockdown. One tag cluster (TC) is shown in the promoter region of the XAF1 gene on chromosome 17 (positions 6600047-6600115, hg18) together with the defining TSSs.
Figure 3. TFBS motifs derived for PU.1 and IRF8 as activators.
The 50 most strongly downregulated TCs after knockdown of each of the two TFs PU.1 and IRF8 and their corresponding promoter regions were used as training data sets to identify binding-site motifs and their respective PWMs (a and b). The PU.1 motif was present in 47 out of 50 sequences with an E-value of 4.6e-23 and is 20 nucleotides wide, while the IRF8 motif was present in 20 out of 50 sequences with an E-value of 2.2e-9 and is 21 nucleotides wide. The expression levels of deepCAGE TSSs containing the motif in their promoter sequences, excluding the training data, were contrasted to all other TSSs (c and d). The same comparisons were performed on promoter regions containing the TRANSFAC motif as well as for regions where the TFs bound to DNA according to ChIP-chip measurements. P-values were calculated using Student's t-test on microarray values.
However, promoters containing PU.1 down or IRF8 down motifs were expressed at significantly lower levels than promoters lacking the motif. Moreover, when the same test was carried out using the published TRANSFAC (17,18) motifs for PU.1 and IRF8, or using ChIP data for PU.1 to identify PU.1-bound promoters, neither outperformed the novel motif (Figure 3). Furthermore, comparison to UCSC's vertebrate-conservation track revealed that 32.8 and 35.5% of the novel PU.1 and IRF8 base positions, respectively, are strictly conserved, while 11 out of 47 and 7 out of 20 PU.1 and IRF8 motifs are completely conserved. This compares with 3-8% average overall conservation and 11-24% conservation in coding regions.
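The comparison of motif-containing promoters against all others can be sketched as a two-sample test. The sketch assumes the classical Student's t-test with pooled variance (the text does not say whether equal variances were assumed) and returns only the statistic and the degrees of freedom. A P-value would additionally need the t distribution, for example through `scipy.stats.ttest_ind`. The function name is hypothetical.

```python
import math
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). A negative t means that sample_a,
    e.g. the fold changes of motif-containing promoters, has the lower mean.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two values")
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(sample_a)
              + (n_b - 1) * variance(sample_b)) / df
    standard_error = math.sqrt(pooled * (1.0 / n_a + 1.0 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df
```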
In a parallel effort, we used the program CLOVER (22) to detect enriched motifs in the top 50 downregulated IRF8 and PU.1 CAGE clusters. As expected, we found enrichment for the corresponding known motifs in both data sets (for details see Supplementary Data and Figures S3 and S4 and Tables S2 and S3). However, the enriched motifs are only weakly overrepresented when considering all downregulated clusters. Therefore, the de novo derived motifs describe the transcriptional response to TF knockdown better than using known motifs or the present ChIP-chip data.
The motifs obtained for PU.1 and IRF8 were longer than the corresponding
motifs in the TRANSFAC database (Figure 4a). Manual alignment
of our matrices to each other and to the TRANSFAC motifs revealed that
both of our motifs contain regions similar to the TRANSFAC PU.1 and the
IRF8 motifs. Furthermore, we observed 44 promoters that were downregulated
in both IRF8 and PU.1 knockdown (Supplementary Figure S5
and Table S4). Our IRF8 motif contains
three triple-T (TTT) regions. To understand their significance,
we truncated our IRF8 motif by removing the triple-T sub-motif from either
end. The expression differences in the test set became less pronounced
(Figure 4b), indicating that all three triple-Ts are
important for the specificity. Similar examples of combinatorial regulation
were previously described for IRF8 and other IRF family members and for
the PU.1 transcription factor (20, 23).
Figure 4. Overlapping motifs for PU.1 and IRF8 transcription
factors.
(a) The binding-site motifs we found for IRF8 and PU.1 were longer than the TRANSFAC motifs and both our motifs contained each of the TRANSFAC motifs as sub-motifs. Our motif for IRF8 was longer than the motifs of other IRF family members (data not shown).
(b) Trimming the characteristic TTT sub-motif from either side of the IRF8 motif reduced the ability of the motif to explain the changes in expression levels. P-values were calculated using Student's t-test.
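The truncation experiment amounts to dropping columns from one end of a position weight matrix before rescanning the test set. A minimal sketch follows, assuming a PWM stored as a list of per-position dictionaries of base probabilities; the representation, the function names and the toy matrix in the usage lines are hypothetical and do not reproduce the IRF8 motif.

```python
def trim_pwm(pwm, left=0, right=0):
    """Return a copy of the PWM without `left` leading and `right` trailing columns."""
    if left < 0 or right < 0 or left + right >= len(pwm):
        raise ValueError("trimming must leave at least one column")
    return pwm[left:len(pwm) - right]


def consensus(pwm):
    """Most probable base at each position, as a string."""
    return "".join(max(column, key=column.get) for column in pwm)


# Toy matrix with a TTT stretch at each end (illustrative values only).
toy = [{"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7}] * 3 \
    + [{"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
       {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1}] \
    + [{"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7}] * 3
without_first_ttt = trim_pwm(toy, left=3)
without_last_ttt = trim_pwm(toy, right=3)
```

Each truncated matrix would then be scanned against the test promoters and the resulting expression difference compared with that of the full-length motif.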
A promoter-based gene regulatory network
Above we have demonstrated that KD followed by deepCAGE expression
profiling (KD-CAGE) can be effectively used to identify promoters
regulated by a given transcription factor. Moreover, highly downregulated
promoters in the PU.1 and IRF8 KDs were shown to contain PU.1 and IRF8
motifs indicating they are direct targets of these factors. The approach
can thus be used to directly generate a transcriptional network model (24).
For illustration purposes, we generated a small sub-network based on genes
co-perturbed by the knockdown of at least two of the four factors (Figure
5). Edges upregulated upon knockdown are shown
in red and those downregulated are shown
in blue. Genes co-regulated by PU.1 and IRF8 were predominantly
co-downregulated upon knockdown. Interestingly, there is an antagonistic
relationship for genes co-regulated by PU.1 and MYB, with the majority
downregulated upon PU.1 KD but upregulated upon MYB KD. The network predicts
47 genes as targets of our novel PU.1 motif. Eight of these (CD74, HCLS1,
NRGN, TNFSF13B, IFI6, MLC1, MARCH3 and CHI3L1) are supported by ChIP signal
for PU.1 (Supplementary Table S1).
Most of these are known to be important in hematopoietic lineages
and IFI6 is known to be an interferon-inducible gene. CHI3L1 has been previously
reported as a PU.1 target (25). However, this is the
first report that TNFSF13B, a myeloid-associated marker gene, is regulated
by both PU.1 and IRF8.
Figure 5. Network inferred from deepCAGE knockdown data.
Our data can be transferred into a network view using Cytoscape (24). The transcription factors represent the nodes and the promoters associated with their genes are the edges. Edges drawn in red indicate upregulation after TF knockdown while edges drawn in blue indicate downregulation. The dotted lines represent edges that are detected by CAGE, while solid lines represent the edges that have a motif found by our method. For easier viewing, we have only shown those nodes from the training set that are influenced by more than one transcription factor.
These directed edges reflect the regulation of individual TSSs
rather than responses at the gene level and represent a powerful new approach
to building alternative promoter-aware networks in the near future.
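The construction of such a sub-network can be sketched as a list of signed, directed edges from a transcription factor to a promoter, followed by a filter for promoters perturbed by at least two factors. The 1.5-fold cutoff for drawing an edge is an assumption borrowed from the training-set selection, and the function names are hypothetical.

```python
from collections import defaultdict


def build_edges(fold_changes, min_fold=1.5):
    """Create (tf, promoter, sign) edges from knockdown fold changes.

    `fold_changes` maps each TF to {promoter: linear fold change KD / control}.
    The sign is "up" or "down" after knockdown; a "down" edge suggests activation.
    """
    edges = []
    for tf, per_promoter in fold_changes.items():
        for promoter, fc in per_promoter.items():
            if fc >= min_fold:
                edges.append((tf, promoter, "up"))
            elif fc <= 1.0 / min_fold:
                edges.append((tf, promoter, "down"))
    return edges


def co_regulated(edges, min_tfs=2):
    """Keep only edges whose promoter responds to at least `min_tfs` factors."""
    tfs_by_promoter = defaultdict(set)
    for tf, promoter, _ in edges:
        tfs_by_promoter[promoter].add(tf)
    keep = {p for p, tfs in tfs_by_promoter.items() if len(tfs) >= min_tfs}
    return [edge for edge in edges if edge[1] in keep]
```

The resulting edge list can be written out as a simple three-column table, a layout that network viewers such as Cytoscape can import.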
DISCUSSION:
We have demonstrated for the first time that deepCAGE technology is a feasible alternative to microarrays for measuring RNAi-mediated perturbations and generating perturbation networks. As the technique is a direct measure of promoter expression, it allows focusing on the actual promoters used in a given cellular context, rather than ambiguous mapping of microarray expression to the 5'-ends of known transcripts. Furthermore, we have shown that our approach can be used to de novo identify regulatory motifs with a clear demonstration of functional motifs for PU.1 and IRF8 with similarity to the published TRANSFAC motifs. The motifs described by us perform better at describing the response to the KD than TRANSFAC and ChIP-chip data.
In the case of PU.1 and IRF8, many of the same promoters responded to either knockdown and a longer composite motif was identified. While the known IRF8 TFBS contains two copies of a triple-T motif, ours contains three copies. This longer motif, however, is functionally relevant as truncating the motif by removing the first or third triple-T reduced our ability to explain the transcriptional response to IRF8 knockdown. These observations are supported by the previously reported cooperative binding of both factors (20, 23). As the significant motifs were identified in the promoters of downregulated genes, we conclude that PU.1 and IRF8 in combination act primarily as activators as previously reported (22), while the motifs observed for MYB suggest it can act either as a repressor or as an activator (Supplementary Figure 2A and B).
This pilot experiment paves the way for building regulatory networks and identifying regulatory motifs for the majority of transcription factors. Genome-wide ChIP of TFs is an alternative approach to identify transcriptional regulatory regions (26), which is being used extensively in the ENCODE project (4). However, to date only 160 ChIP grade antibodies are available for the estimated 882 DNA-binding transcription factors in mammals (27). KD-CAGE is not restricted by such reagents, and in light of the constantly falling cost of DNA sequencing (28) it is possible to test a large collection of DNA-binding proteins to characterize their function. In addition, of the 330 regulatory interactions we reported in our four knockdown experiments (Supplementary Table S1), only 3 were supported by current ChIP-chip experiments. This highlights that there are sites where the TF is bound but is functionally inactive, as noted by Wasserman and Sandelin (9). However, in spite of this, a combined approach would potentially be a very powerful method to discriminate indirect targets from direct targets bound by factors at both proximal and distal sites including enhancers and insulators.
Finally, we have previously described the application of motif activity
response analysis (MARA) in a developmental time course to predict
the regulation by TFs on individual promoters (3). However,
this approach depends on known TFBS motifs. The approach described
here can be used to identify TFBS motifs de novo. In the future,
we will aim to extend the set of known motifs using this approach and extend
our network analyses to encompass the function and targets of uncharacterized
DNA-binding proteins and to provide a network of interactions among such
proteins.
SUPPLEMENTARY DATA:
http://nar.oxfordjournals.org/content/38/22/8141/suppl/DC1
Supplementary Material:
Materials and methods:
Cell culture and knockdown experiments
THP-1 cell culture preparation, knockdown experiment procedures are described in detail in [1].
deepCAGE library generation, mapping and clustering of deepCAGE tags
CAGE libraries were constructed as described in supplementary material of [1].
De-novo transcription factor binding site prediction
We extracted the corresponding active deepCAGE promoter sequences from the human genome (hg18). The sequences were first masked for repeats with ‘N’s and then also for low information segments using the program Dust (Tatusov,R.L. and Lipman,D.J., part of NCBI toolkit). Motif finding was performed by the program MEME (version 3.5.7) [2], searching for motifs on both strands with a length of at least 4 nucleotides and an E-value cutoff of 0.01, as suggested in [2] for finding biologically relevant motifs (command: meme filename -dna -nmotifs 20 -revcomp -evt 0.01 -minw 4). For scanning the MEME obtained motif across all sequences in our data set, we used the program Fimo from the Meta-MEME package using the p-value threshold of 1e-5 (command: fimo filename -motif –pthresh 1e-5 meme_file). We relaxed the threshold from the default value of 1e-6 since it was too stringent for our search and gave very few results (data not shown). We also evaluated our method by using TRANSFAC’s Match program to scan our sequences for the presence of TRANSFAC defined motifs with the ‘minimize false positives’ cut-off for the matrices. We used the ChIP-chip data from [1] for the PU.1 and SP1 transcription factors; data was selected with the standard deviation of 3.
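To make the scanning step concrete, the sketch below scores every window of a masked sequence against a position weight matrix on both strands. It is not the algorithm implemented in Fimo, which thresholds on p-values; here a raw log-odds score cutoff, a uniform background and a small pseudocount are assumed, and all names are hypothetical.

```python
import math

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def log_odds(pwm, background=0.25, pseudo=0.01):
    """Convert probability columns into log2-odds columns (uniform background)."""
    return [{base: math.log2((column.get(base, 0.0) + pseudo)
                             / (1 + 4 * pseudo) / background)
             for base in "ACGT"}
            for column in pwm]


def scan(seq, lo_pwm, min_score):
    """Return (position, strand, score) for windows scoring >= min_score.

    Windows containing masked bases ('N') or other symbols are skipped.
    """
    width = len(lo_pwm)
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        if not set(window) <= set("ACGT"):
            continue
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            score = sum(column[base] for column, base in zip(lo_pwm, site))
            if score >= min_score:
                hits.append((i, strand, score))
    return hits
```

Reported positions refer to the forward-strand start of the window, so a hit on the minus strand covers the same coordinates as the corresponding plus-strand window.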
CLOVER motif enrichment analysis
We performed motif enrichment analysis using CLOVER [3]. Firstly, we detected enriched motifs in the original top 50 down regulated CAGE clusters for both PU.1 and IRF8 and then searched for the enriched motifs in the remaining 3322 CAGE derived clusters (excluding the top 50 used for training). The results are presented in Supplementary tables 2 and 3 and Supplementary figures 4 and 5.
Clover uses known motifs from Jaspar database (used Jaspar 2009 version)
to scan for motif presence in the given dataset. This database does not
contain the IRF8 motif but it does contain motifs for IRF1 and IRF2 transcription
factors that closely resemble IRF8. For the PU.1 dataset, CLOVER found
overrepresented motifs for IRF1, IRF2, SPI1 (PU.1), SPIB, ELF5 and FEV,
while for the IRF8 dataset it found overrepresented motifs for IRF1, SPI1
(PU.1), IRF2, SPIB, ETS1 and ELF5. When comparing the CAGE derived clusters
that have these motifs to those that do not, for PU.1 we find significant
p-values for IRF1, IRF2, PU.1 and SPIB transcription factors while for
IRF8 we find significant p-values for IRF1 and SPIB transcription factors.
This analysis is consistent with our findings of both the PU.1 and IRF8
motifs individually and in combination in a number of overlapping clusters.
However, the obtained p-values are of lower significance than those
obtained by MEME and in both cases the clusters containing the motifs are
not representative for the down-regulated set (their median is not below
0). In conclusion these motifs do not explain down regulation better than
our longer overlapping motifs.
Supplemental References:
1. Suzuki, H., Forrest, A.R., van Nimwegen, E., Daub, C.O.,
Balwierz, P.J., Irvine, K.M., Lassmann, T., Ravasi, T., Hasegawa, Y., de
Hoon, M.J., et al. (2009) The transcriptional network that controls growth
arrest and differentiation in a human myeloid leukemia cell line. Nat Genet.,
41(5):553-62.
2. Bailey,T.L., Williams,N., Misleh,C. and Li,W.W. (2006) MEME:
discovering and analyzing DNA and protein sequence motifs. Nucleic Acids
Research., 34(Web Server issue): W369-73.
3. Frith, M.C., Fu, Y., Yu,L., Chen,J.F., Hansen, U. and Weng,Z.
(2004) Detection of functional DNA motifs via statistical over-representation.
Nucleic Acids Res.,32, 1372-1381.
Supplementary figure legends:
Figure supplementary 1. TFBS motifs derived for PU.1 and
IRF8 as activators. As a companion to Figure 3
from the manuscript, we have also drawn our boxplots using only CAGE expression
values. There are no discernible differences.
Figure supplementary 2. Other obtained motifs. Apart from
the motifs for the PU.1 down-regulated and IRF8 down-regulated sets, we
also found motifs for the following data sets: MYB down-regulated
(a) present in 17 out of 50 sequences with an e-value of 3e-003
and the width of 29 nucleotides; MYB up-regulated (b) present in
38 out of 50 sequences with an e-value of 5.8e-016 and
width of 20 nucleotides; PU.1 up-regulated (c) present in 4 out of 38 sites
with an e-value of 3.8e-005 and 50 nucleotides wide; and SP1
down-regulated (d) present in 48 out of 50 sites, with an e-value of 2.7e-008
and width of 20 nucleotides. We checked these motifs for specificity in
the overall data set but the values obtained were not significant enough
to pursue further analysis. P-values were calculated using Student’s
t-test.
Figure supplementary 3. Position weight matrices for the
other obtained motifs.
Figure supplementary 4. Boxplots for motifs overrepresented
in the PU.1 down dataset. Each pair represents the sequences that have
the corresponding motif with their respective background.
Figure supplementary 5. Boxplots for motifs overrepresented
in the IRF8 down dataset. Each pair represents the sequences that have
the corresponding motif with their respective background.
Figure supplementary 6. The obtained motifs are specific
for their given data set. To make sure the obtained motifs were specific
for the transcription factor set the MEME searching was performed on, all
of the transcription factor sets were scanned with each motif. Interestingly,
when we scanned the sets with the PU.1 down-regulated motif, we found significant
values both for the PU.1 and IRF8 down-regulated sets, which implies that
these two motifs are somehow connected.
FUNDING:
Research Grant for RIKEN Omics Science Center from Ministry of Education,
Culture, Sports, Science and Technology (MEXT) (to Y.H.); International
Program Associate stipend from RIKEN (to M.V.). Funding for open access
charge: Research Grant for RIKEN Omics Science Center from Ministry of
Education, Culture, Sports, Science and Technology (MEXT) (to Y.H.).
Conflict of interest statement. None declared.
ACKNOWLEDGEMENTS:
Mr Akira Hasegawa assisted in the alignment of the deepCAGE tags.
P.C. developed the deepCAGE technology. T.L. conceived perturbation deepCAGE.
M.V. and T.L. designed the experiments and carried out the motif analyses
and network building. A.R.R.F. carried out the microarray analysis and Entrez
gene mapping. Y.T. and M.S. carried out the knockdowns. T.L., M.V., A.R.R.F.
and C.D. wrote the manuscript. H.S., Y.H. and C.D. advised on the experimental
design. All authors read and approved the final manuscript.
Footnotes:
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.
This is an Open Access article distributed under the terms of the
Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5),
which permits unrestricted non-commercial use, distribution, and reproduction
in any medium, provided the original work is properly cited.
REFERENCES:
1. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J,
Devon K, Dewar K, Doyle M, FitzHugh W, et al. Initial sequencing and analysis
of the human genome. Nature 2001;409:860-921.
2. Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda
N, Oyama R, Ravasi T, Lenhard B, Wells C, et al. The transcriptional landscape
of the mammalian genome. Science 2005;309:1559-1563.
3. Suzuki H, Forrest AR, van Nimwegen E, Daub CO, Balwierz PJ, Irvine
KM, Lassmann T, Ravasi T, Hasegawa Y, de Hoon MJ, et al. The transcriptional
network that controls growth arrest and differentiation in a human myeloid
leukemia cell line. Nat. Genet. 2009;41:553-562.
4. Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR,
Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE, et al. Identification
and analysis of functional elements in 1% of the human genome by the ENCODE
pilot project. Nature 2007;447:799-816.
5. Quackenbush J. Extracting biology from high-dimensional biological
data. J. Exp. Biol. 2007;210 Pt 9:1507-1517.
6. Segal E, Shapira M, Regev A, Pe'er D, Botstein D, Koller D, Friedman
N. Module networks: identifying regulatory modules and their condition-specific
regulators from gene expression data. Nat. Genet. 2003;34:166-176.
7. Carninci P, Sandelin A, Lenhard B, Katayama S, Shimokawa K, Ponjavic
J, Semple CA, Taylor MS, Engstrom PG, Frith MC. Genome-wide analysis of
mammalian promoter architecture and evolution. Nat. Genet. 2006;38:626-635.
8. Tomaru Y, Simon C, Forrest AR, Miura H, Kubosaki A, Hayashizaki
Y, Suzuki M. Regulatory interdependence of myeloid transcription factors
revealed by Matrix RNAi analysis. Genome Biol. 2009;10:R121.
9. Wasserman WW, Sandelin A. Applied bioinformatics for the identification
of regulatory elements. Nat. Rev. Genet. 2004;5:276-287.
10. Sikder D, Kodadek T. Genomic studies of transcription factor-DNA
interactions. Curr. Opin. Chem. Biol. 2005;9:38-45.
11. Kodzius R, Kojima M, Nishiyori H, Nakamura M, Fukuda S, Tagami
M, Sasaki D, Imamura K, Kai C, Harbers M, et al. CAGE: Cap analysis of
gene expression. Nat. Methods. 2006;3:211-222.
12. Ponjavic J, Lenhard B, Kai C, Kawai J, Carninci P, Hayashizaki
Y, Sandelin A. Transcriptional and structural impact of TATA-initiation
site spacing in mammalian core promoters. Genome Biol. 2006;7:R78.
13. Valen E, Pascarella G, Chalk A, Maeda N, Kojima M, Kawazu C,
Murata M, Nishiyori H, Lazarevic D, Motti D, et al. Genome-wide detection
and analysis of hippocampus core promoters using DeepCAGE. Genome Res.
2009;19:255-265.
14. Tsuchiya S, Yamabe M, Yamaguchi Y, Kobayashi Y, Konno T, Tada
K. Establishment and characterization of a human acute monocytic leukemia
cell line (THP-1). Int. J. Cancer. 1980;26:171-176.
15. Bailey TL, Williams N, Misleh C, Li WW. MEME: discovering and
analyzing DNA and protein sequence motifs. Nucleic Acids Res. 2006;34 Web
Server issue:W369-W373.
16. Grundy WN, Bailey TL, Elkan CP, Baker ME. Meta-MEME: Motif-based
Hidden Markov Models of Biological Sequences. Comput. Appl. Biosci. 1997;13:397-406.
17. Matys V, Kel-Margoulis OV, Fricke E, Liebich IL, Barre-Dirrie
S, Reuter A, Chekmenev I, Krull D, Hornischer MK, et al. TRANSFAC®
and its module TRANSCompel®: transcriptional gene regulation
in eukaryotes. Nucleic Acids Res. 2006;34 Database issue:D108-D110.
18. Kel AE, Gossling E, Reuter I, Cheremushkin E, Kel-Margoulis
OV, Wingender E. MATCH™: A tool for searching transcription factor binding
sites in DNA sequences. Nucleic Acids Res. 2003;31:3576-3579.
19. de Hoon M, Hayashizaki Y. Deep cap analysis gene expression
(CAGE): genome-wide identification of promoters, quantification of their
expression, and network inference. Biotechniques 2008;44:627-628, 630,
632.
20. Meraro D, Gleit-Kielmanowicz M, Hauser H, Levi BZ. IFN-stimulated
gene 15 is synergistically activated through interactions between the myelocyte/lymphocyte-specific
transcription factors, PU.1, IFN regulatory factor-8/IFN consensus sequence
binding protein, and IFN regulatory factor-4: characterization of a new
subtype of IFN-stimulated response element. J. Immunol. 2002;168:6224-6231.
21. Luscher B, Eisenman RN. New light on Myc and Myb. Part II. Myb.
Genes Dev. 1990;4:2235-2241.
22. Frith MC, Fu Y, Yu L, Chen JF, Hansen U, Weng Z. Detection of
functional DNA motifs via statistical over-representation. Nucleic Acids
Res. 2004;32:1372-1381.
23. Marecki S, Fenton MJ. PU.1/Interferon Regulatory Factor interactions:
mechanisms of transcriptional regulation. Cell Biochem. Biophys. 2000;33:127-148.
24. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D,
Amin N, Schwikowski B, Ideker T. Cytoscape: a software environment for
integrated models of biomolecular interaction networks. Genome
Res. 2003;13:2498-2504.
25. Rehli M, Niller HH, Ammon C, Langmann S, Schwarzfischer L, Andreesen
R, Krause SW. Transcriptional regulation of CHI3L1, a marker gene for late
stages of macrophage differentiation. J. Biol. Chem. 2003;278:44058-44067.
26. Pillai S, Chellappan SP. ChIP on chip assays: genome-wide analysis
of transcription factor binding and histone modifications. Methods Mol.
Biol. 2009;523:341-366.
27. Fulton DL, Sundararajan S, Badis G, Hughes TR, Wasserman WW,
Roach JC, Sladek R. TFCat: the curated catalog of mouse and human transcription
factors. Genome Biol. 2009;10:R29.
28. Service RF. GENE SEQUENCING: the race for the $1000 Genome.
Science 2006;311:1544-1546.
This detailed analysis by Morana Vitezic, Timo Lassmann, Alistair
Forrest, Masanori Suzuki, Yasuhiro Tomaru, Jun Kawai, Piero Carninci, Harukazu
Suzuki, Yoshihide Hayashizaki, and Carsten Daub reveals the genome-wide
transcriptional response of individual genes to perturbation of the cell
system by individual siRNAs. Comparing each gene response with each promoter
response yields a tight mapping of the pathways used by each gene in each
network and identifies the entry and control points of each gene network.
Additional References:
1. Frenster JH, and Hovsepian JA, "Models of successive levels of resolution during individual gene transcription".
2. Frenster JH, and Hovsepian JA, "Micro RNAs and adult neoplasms of embryonic type".
3. Mishra PJ, and Merlino G, "MicroRNA reexpression as differentiation therapy in cancer".
4. Taulli R, Bersani F, Foglizzo V, Linari A, Vigna E, Ladanyi M, Tuschl T, and Ponzetto C, "The muscle-specific microRNA miR-206 blocks human rhabdomyosarcoma growth in xenotransplanted mice by promoting myogenic differentiation".
5. Frenster JH, and Hovsepian JA, "Reprogramming the human cancer cell nucleus".
1. Each cell retains all of its embryonic genes for a lifetime.
2. Controls for embryonic genes are often absent in adults.
3. Uncontrolled embryonic genes can replicate wildly.
4. Replicating genes participate in intra-cellular competition.
5. The basis for gene competition is selective transcription.
6. MicroRNAs can reprogram embryonic transcription.
7. Gene reprogramming can produce normal phenotypes.
8. Normal phenotypes can by-pass chromosomal lesions.
9. MicroRNA therapy may need to be permanent.
10. Transplantation of microRNAs could be preferred.
1. Pathways within cell genomes involve a flow of information.
2. Information can flow by direct contact or by third parties.
3. Direct contact within whole genomes is difficult to regulate.
4. DNA-DNA direct contacts are influenced by agents.
5. Nuclear agents include hydrophilic ionic and hydrophobic conforming ligands.
6. Third parties within genomes involve RNAs and proteins.
7. RNAs and proteins are easy to regulate or reverse.
8. Information can be shared, lost, or transformed.
9. System information can be hidden during system isolation.
10. Local information can be permanently lost during system entropy.
http://www.cancerbiophysics.net/
For Further Information and Feedback:
Jeannette A. Hovsepian, M.D.
E-mail: frensasc@ix.netcom.com
Phone: +1 650 367 6483