Before using this tool, all users should check at least the following points:
See the following sections for details.
We have developed a novel tool that searches for cis-element candidates in the upstream, downstream, or coding regions of differentially regulated genes.
RiCES lists possible cis-element motifs corresponding to genes of interest, and it will contribute to the deeper understanding of gene regulatory mechanisms in plants.
The tool first accepts a list of genes that the user is interested in and lists cis-element candidate motifs corresponding to the submitted genes. It then scores the likelihood of the listed candidate motifs by association rule analysis. Finally, notable cis-element candidates are selected and presented with related information.

RiCES is a Web application. Users operate RiCES by entering the appropriate data into the form at http://hpc.irri.cgiar.org/tool/nias/ces. No special skills are required.
RiCES assumes that the user has already identified genes of interest through experimental analysis (e.g. clusters of coordinately regulated genes). RiCES recognizes GenBank accession numbers, identifiers of transcription units (TUs) as defined in the TIGR pseudomolecule assemblies, and several other major gene identification systems (see another page for details). Using the list, it retrieves the associated upstream, downstream, or coding region sequences of the specified genes from available genomic sequence data.
The second step of the analysis is the compilation of a list of motifs as candidate cis-elements by the following methods:
The first method depends on ab initio motif searching. It is based on the supposition that if cis-elements play important roles in the regulation of a given set of genes, they will be statistically overrepresented in the associated promoter sequences as conserved motifs, and can therefore be identified with a suitable motif search program. Several such programs, implementing various algorithms, are available. We have chosen MEME, a publicly available motif discovery program based on an expectation-maximization algorithm. In our analysis, MEME is invoked to identify motifs 6 to 8 bp long that appear highly conserved among the promoter sequences of the selected genes. Users can modify some of the MEME search parameters via the Web form.
The second method relies on the hypothesis that common, known cis-elements play important roles under the experimental conditions that gave rise to the list of genes specified by the user. Therefore, RiCES searches for matches to a pre-compiled list of known cis-elements.
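As a rough illustration of the second method, the sketch below scans a set of promoter sequences for matches to a list of known motifs and records where each match starts. It is not the RiCES implementation: the function name `scan_promoters`, the gene identifiers, the sequences, and the two motif patterns are all hypothetical and chosen only to make the example self-contained.

```python
import re


def scan_promoters(promoters, motifs):
    """Return {motif_name: {gene_id: [0-based start positions]}} for all hits.

    promoters: dict mapping a gene identifier to its promoter sequence.
    motifs:    dict mapping a motif name to a regular expression.
    """
    hits = {}
    for name, pattern in motifs.items():
        regex = re.compile(pattern)
        per_gene = {}
        for gene_id, seq in promoters.items():
            # Sequences are upper-cased so that soft-masked input still matches.
            starts = [m.start() for m in regex.finditer(seq.upper())]
            if starts:
                per_gene[gene_id] = starts
        hits[name] = per_gene
    return hits


# Hypothetical toy data (not taken from RiCES or any database).
promoters = {
    "gene1": "TTCACGTGAA",
    "gene2": "AAAAAAAAAA",
    "gene3": "cacgtgCACGTG",
}
motifs = {"motif_A": "CACGTG", "motif_B": "TT[AC]A"}

hits = scan_promoters(promoters, motifs)
for name, per_gene in hits.items():
    print(name, per_gene)
```

Note that `finditer` reports non-overlapping matches on the given strand only; scanning the opposite strand would require searching the reverse complement as well.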
Existing databases of plant cis-elements are not exhaustive enough to distinguish 'core' motifs, which determine the function of cis-elements, from co-existing sequences in neighboring regions. As a result, many cis-element sequences in these databases include superficial core motifs for which no evidence of functionality has been obtained. The use of such data hinders effective informatic analysis.
We compiled a novel database of known cis-elements and incorporated it into RiCES. The cis-elements are collected from reports of experiments such as gel shift assays and footprint analyses, categorized by transcription factor, and documented with respect to known activity in the plant genome. Some cis-elements known only in organisms other than plants are also listed, in consideration of their possible, albeit unknown, roles in plants.
The database includes four types of cis-elements:
Instead of using MEME or the pre-compiled list, users can specify the sequences of cis-element candidates they are interested in. The candidate nucleotide sequences should be entered in the "Motif List" box in the "Optional Items" section of the application form, with one sequence per line. The sequences should be written as regular expressions.
The third step of the analysis is the likelihood evaluation of the cis-element candidates by association rule analysis, which is a data mining method designed to discover significant relationships between pairs of characteristics observed in data sets. Candidates showing the highest likelihood (specificity) are retained in the final cis-element candidate list.
The strategy depends on the idea that motifs overrepresented in the promoter regions of the genes of interest could play specific roles in regulating the expression of those genes. Implied cause-and-effect relationships, documented as 'rules', are evaluated by using several well-known indices of likelihood, including support, confidence, and lift. On the basis of sample data sets, the lift index appeared to best discriminate significant relationships between experimental conditions and cis-element candidates. We set the default threshold of lift to 1.0, and cis-element candidates are included in the final candidate list only if their lift value is higher than this threshold.
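The three indices can be illustrated with a small calculation. The sketch below uses the standard textbook definitions of support, confidence, and lift for the rule "gene belongs to the target set, therefore its promoter contains the motif"; the exact formulas and rule direction used inside RiCES are not stated here, so this is an assumption. The function name `association_indices` and the gene counts are hypothetical.

```python
def association_indices(n_total, n_target, n_motif, n_both):
    """Support, confidence and lift for the rule 'target gene => has motif'.

    n_total:  number of genes in the reference set (targets included)
    n_target: number of user-specified target genes
    n_motif:  number of reference genes whose promoter contains the motif
    n_both:   number of target genes whose promoter contains the motif
    """
    if min(n_total, n_target, n_motif) <= 0:
        raise ValueError("n_total, n_target and n_motif must be positive")
    support = n_both / n_total            # P(target and motif)
    confidence = n_both / n_target        # P(motif | target)
    lift = confidence / (n_motif / n_total)  # P(motif | target) / P(motif)
    return support, confidence, lift


# Hypothetical counts: 1000 reference genes, 50 targets,
# 100 genes carrying the motif, 20 of which are targets.
support, confidence, lift = association_indices(1000, 50, 100, 20)
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.2f}")

# A candidate is kept only if its lift exceeds the threshold (default 1.0).
LIFT_THRESHOLD = 1.0
keep_candidate = lift > LIFT_THRESHOLD
```

A lift of 1.0 means the motif is as frequent among the target genes as in the reference set as a whole, which is why values above 1.0 are read as enrichment.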
The final cis-element candidate list is presented as an association table with the identifiers of the submitted genes (TU identifiers based on TIGR gene model annotation are used in the current version), annotated with any available corresponding information from RiceCyc (http://www.gramene.org/pathway/) and Gene Ontology. RiCES also provides information on candidate motifs, including the positions of the element in the promoter regions of corresponding TUs, the sequence, and related information from AtcisDB. The positions of the cis-element candidates are presented in both text and graphics.
Association rule analysis is based on simple arithmetic. The analysis starts from the ratio of the number of genes possessing certain cis-elements in their promoter regions to the number of genes lacking them.
Although association rule analysis is a simple and effective approach, this method tends to produce false-positive results when the number of user-defined target genes is much smaller than the number of genes in the reference gene list. By default, the reference gene list consists of all available TUs stored in the KOME database, which includes up to 30,000 genes.
Users can try association rule analysis with a smaller reference gene set, selected from the default reference gene set under certain conditions, such as possession of a known cis-element motif in the upstream region. We have prepared several such reference sets, which users can easily select in the form on the starting page.
The size of the reference gene list can also be reduced by selecting the 'random selection' option.
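Random reduction of the reference set can be pictured as sampling without replacement. The following is only a sketch of that idea: how RiCES actually draws its random subset (for example, whether the target genes are always retained) is not described here, and the function name `subsample_reference` and the `TU00000`-style identifiers are hypothetical.

```python
import random


def subsample_reference(reference_ids, size, seed=None):
    """Draw `size` distinct gene identifiers uniformly at random."""
    unique_ids = sorted(set(reference_ids))  # sorted for reproducibility
    if size > len(unique_ids):
        raise ValueError("size exceeds the number of reference genes")
    rng = random.Random(seed)  # local generator; global state is untouched
    return rng.sample(unique_ids, size)


# Hypothetical reference list of 30,000 identifiers reduced to 3,000.
reference = [f"TU{i:05d}" for i in range(30000)]
subset = subsample_reference(reference, 3000, seed=0)
print(len(subset), subset[:3])
```

Fixing the seed makes the reduced reference set reproducible, so that lift values from repeated runs can be compared.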
The sequences in the pre-compiled cis-element lists are stored as Perl-compatible regular expressions in order to represent ambiguous sequence patterns. Results are also presented in this notation. Users should follow the same syntax when defining a "user-defined candidate list".
Several examples:
| Helix-turn-helix(HTH) | (CTAATTG){2,3} |
| BBR/BPC | ((GA)+|(TC)+) |
| RAV | CAACA[ACGT]*CACCTG |
See http://perldoc.perl.org/perlrequick.html to find out more about regular expressions.
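The three example patterns listed in this section can be tried out before submitting a user-defined list. The sketch below uses Python's `re` module, which handles the constructs used here (grouping, `{m,n}` repetition, `+`, `*`, alternation, and character classes) in the same way as Perl. The test sequences and the helper name `first_match` are hypothetical.

```python
import re

# Patterns copied from the examples in this section.
patterns = {
    "Helix-turn-helix(HTH)": r"(CTAATTG){2,3}",
    "BBR/BPC": r"((GA)+|(TC)+)",
    "RAV": r"CAACA[ACGT]*CACCTG",
}


def first_match(pattern, sequence):
    """Return the first (leftmost) matching substring, or None if no match."""
    m = re.search(pattern, sequence)
    return m.group(0) if m else None


# Two or three tandem copies of CTAATTG are required; one copy is not enough.
hth_two = first_match(patterns["Helix-turn-helix(HTH)"], "AACTAATTGCTAATTGAA")
hth_one = first_match(patterns["Helix-turn-helix(HTH)"], "AACTAATTGAA")

# A run of GA repeats (or TC repeats) of any length.
bbr = first_match(patterns["BBR/BPC"], "CCGAGAGACC")

# CAACA and CACCTG separated by any number of bases.
rav = first_match(patterns["RAV"], "TTCAACAGGGCACCTGTT")

print(hth_two, hth_one, bbr, rav)
```

Because `[ACGT]*` also matches zero bases and has no upper bound, the RAV pattern matches the two half-sites at any distance from each other within the searched sequence.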
We also tried to evaluate pairwise combinations of motifs in the preliminary candidate list, in consideration of possible protein-protein interactions among multiple transcription factors binding cis-elements, as previously illustrated by experimental evidence (Ulmasov et al., 1995; Ulmasov et al., 1999).
Currently this is achieved by generating a "combined cis-element candidate list" from the preliminary candidate list. For example, if the preliminary list consists of three candidates, "AACC", "ATAT", and "GCAT", then "AACC.*ATAT", "AACC.*GCAT", "ATAT.*GCAT", "ATAT.*AACC", etc. will be included in the combined cis-element candidate list. Complementary (reverse-complement) sequences are also taken into consideration, and thus motifs such as "GGTT.*ATGC" will also be added to the combined list. The members of this list are evaluated just like members of the original candidate list.
Users can run this analysis by selecting "Yes" for the "Do Combined motif analysis" option. Note that it can be quite time-consuming.
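The construction of the combined list can be sketched as follows. This is an illustration rather than the RiCES code: it assumes that candidates are plain A/C/G/T strings (reverse-complementing a general regular expression is not handled), that every ordered pair of distinct entries is generated, and that a motif may be paired with its own reverse complement. The function names are hypothetical.

```python
from itertools import permutations

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Reverse complement of a plain A/C/G/T string."""
    return motif.translate(_COMPLEMENT)[::-1]


def combined_candidates(motifs):
    """Build 'X.*Y' patterns from all ordered pairs of motifs and their
    reverse complements (duplicates such as palindromes are dropped)."""
    expanded = []
    for motif in motifs:
        for variant in (motif, reverse_complement(motif)):
            if variant not in expanded:
                expanded.append(variant)
    return [f"{a}.*{b}" for a, b in permutations(expanded, 2)]


# The three preliminary candidates used in the example above.
combined = combined_candidates(["AACC", "ATAT", "GCAT"])
print(len(combined))
print(combined[:5])
```

Under these assumptions the number of combined patterns grows roughly with the square of the number of preliminary candidates, which is consistent with the combined analysis taking considerably longer than the standard one.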
Last Update: 11 July, 2007