Motif Distribution Viewer

Overview

MDV is a Web-based tool for finding localized promoter motifs by visualizing the distribution of motifs on a user-defined set of promoter sequences. The user selects the promoter set, an optional list of sequence IDs that restricts it, and the target regions in the promoters, which are specified as base positions relative to the transcription start site (TSS). The user also selects a motif or motif set, represented either by position weight matrices (PWMs, typically for known motifs) or by text patterns using International Union of Biochemistry (IUB) codes, and can specify sequence patterns to be excluded from the search. The parameters used to construct the distributions (histograms) can be refined by the user. The tool also provides a unique two-dimensional display that gives an overview of the distribution of multiple motifs.

For the motifs specified by the user, MDV reports the distribution of motif positions across all sequences in the selected promoter set, or across the user-defined subset. The following steps demonstrate how MDV is used through its Web-based interface.

Step 1: Select a sequence set

Figure 1(a) shows the main page of MDV. The user selects a sequence (gene/promoter) set from the list. We currently provide the human gene set from H-InvDB release 5.0 (http://h-invitational.jp/) and the human and mouse gene sets from DBTSS ver 5.2 (http://dbtss.hgc.jp/). The user can restrict the gene set, based on biological knowledge or on experiments such as microarrays, by entering a list of IDs in the textbox. For the H-InvDB gene set, the IDs must be HIX IDs (H-InvDB cluster IDs) delimited by spaces or line feeds; for the DBTSS gene set, they must be RefSeq IDs of the form "NM_*".
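As an illustration of the ID list format described above, the following minimal sketch splits a textbox entry on spaces or line feeds and checks for the "NM_" prefix. The function names are hypothetical and the digit-based accession pattern is an assumption; this is not MDV's own input handling.

```python
import re


def parse_id_list(text):
    """Split a textbox entry on whitespace, dropping blanks and duplicates."""
    seen = set()
    ids = []
    for token in text.split():  # split() handles spaces and line feeds alike
        if token not in seen:
            seen.add(token)
            ids.append(token)
    return ids


def looks_like_refseq_mrna(identifier):
    """Assumed shape of an "NM_*" ID: NM_, digits, optional ".version"."""
    return re.fullmatch(r"NM_\d+(\.\d+)?", identifier) is not None


# Placeholder IDs, used only to show the format.
entry = "NM_000001 NM_000002\nNM_000001"
ids = parse_id_list(entry)
```

Preparing the list in this way before pasting it into the textbox avoids duplicate or malformed entries.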

Step 2: Set parameters for search and display (optional)

The user can specify the search range within the promoter region and adjust the layout of the resulting panels. Sequence coordinates run from -1000 to +499, with position 0 representing the transcription start site (TSS). The "width" parameter is the number of pixels used for motif Type B (PWM-2Dprofile); the motif types are described in Step 3. The pixel height can be adjusted only when the user selects "Type B" analysis in the following step.
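The coordinate system can be sketched as a simple offset, assuming each promoter is held as a 1500-character string whose first character corresponds to position -1000. The storage layout and the function names are assumptions made for illustration.

```python
UPSTREAM = -1000   # first position, from the text
DOWNSTREAM = 499   # last position, from the text


def position_to_index(pos):
    """Convert a TSS-relative position to a 0-based string index."""
    if not UPSTREAM <= pos <= DOWNSTREAM:
        raise ValueError(f"position {pos} is outside {UPSTREAM}..{DOWNSTREAM}")
    return pos - UPSTREAM


def index_to_position(index):
    """Convert a 0-based string index back to a TSS-relative position."""
    return index + UPSTREAM


def slice_region(seq, start, end):
    """Return the bases from start to end, both inclusive."""
    return seq[position_to_index(start):position_to_index(end) + 1]
```

Because position 0 is included, the range -1000 to +499 covers 1500 bases, and the TSS sits at index 1000.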

Step 3: Select a motif-view type

A motif is a group of different DNA sequences on promoters that are important in the transcription of mRNA; most promoter motifs are presumed to be bound by transcription factors. A motif is usually represented in one of two ways: (i) a position weight matrix (PWM) or position-specific scoring matrix (PSSM), or (ii) a text pattern using IUB codes, in which each character indicates either a single base or a set of possible bases at that position. For example, the IUB code "Y" corresponds to either cytosine (C) or thymine (T) (see http://biocorp.ca/IUB.php for a full list of codes). We use the PWM dataset from the JASPAR database (http://jaspar.genereg.net/). The user selects one of the following four types, which specify the motif representation and view: A) PWM-histogram, B) PWM-2Dprofile, C) IUB-histogram, or D) IUB-2Dprofile. Each type is summarized below.
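To make the IUB notation concrete, the sketch below expands a text pattern into the concrete DNA sequences it stands for. The table lists the standard nucleotide ambiguity codes; the function name is hypothetical and the code is independent of MDV.

```python
from itertools import product

# Standard IUB nucleotide codes and the bases each one allows.
IUB = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(pattern):
    """List every concrete sequence represented by an IUB pattern."""
    choices = [IUB[code] for code in pattern.upper()]
    return ["".join(bases) for bases in product(*choices)]


print(expand("TAY"))  # ['TAC', 'TAT']
```

The number of sequences grows multiplicatively with each ambiguous position, so full expansion is practical only for short patterns.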

Type A

Type A (PWM-histogram) displays the positional distribution of a single PWM motif on the selected gene set as a histogram (Figure 2).
Histogram example
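A positional histogram of this kind could be built roughly as follows. The text does not state how MDV scores PWM matches, so the log-odds scoring, uniform background, pseudocount, score threshold, and bin size here are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn per-column base counts into log2 odds against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2((column[base] + pseudocount) / total / background)
            for base in "ACGT"
        })
    return matrix


def scan(seq, matrix, threshold):
    """Return 0-based start indices of windows scoring at or above threshold."""
    width = len(matrix)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(matrix[j][base] for j, base in enumerate(window))
        if score >= threshold:
            hits.append(i)
    return hits


def position_histogram(promoters, matrix, threshold, bin_size=50, offset=-1000):
    """Count hits per bin; keys are the TSS-relative start of each bin."""
    bins = Counter()
    for seq in promoters:
        for i in scan(seq, matrix, threshold):
            bins[offset + (i // bin_size) * bin_size] += 1
    return dict(sorted(bins.items()))
```

A peak in the resulting counts at a particular bin suggests that the motif is localized at that distance from the TSS.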

Type B

Type B (PWM-2Dprofile) displays the positional distribution of multiple PWM motifs on the selected gene set as a 2D profile (Figure 3). Clicking a motif name in the 2D profile displays the histogram of that motif (as for Type A). The user can choose to display either one or both strands of the motifs.
2D-profile example
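One way to picture a 2D profile is as a table with one row per motif and one column per position bin. The sketch below builds such a table from precomputed hit positions and shows how hits on the opposite strand can be mapped back to forward coordinates. The data layout, the bin size, and the motif name are hypothetical and do not describe MDV's internal format.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def to_forward_start(rc_index, motif_width, seq_len):
    """Map a hit start found on the reverse complement to the forward strand."""
    return seq_len - rc_index - motif_width


def two_d_profile(hits_by_motif, seq_len=1500, bin_size=100):
    """Build {motif name: [hit count per bin]} from 0-based hit starts."""
    n_bins = -(-seq_len // bin_size)  # ceiling division
    profile = {}
    for name, starts in hits_by_motif.items():
        row = [0] * n_bins
        for start in starts:
            row[start // bin_size] += 1
        profile[name] = row
    return profile


# "motif_A" is a placeholder name, not an entry from any database.
profile = two_d_profile({"motif_A": [0, 5, 12]}, seq_len=20, bin_size=10)
```

Each row can then be drawn as a strip of cells whose intensity reflects the count, which lets many motifs be compared in a single panel.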

Type C

Type C (IUB-histogram) displays the positional distribution of a single motif written in IUB codes on the selected gene set as a histogram. Only this type lets the user specify sequences to be excluded from the search. A pattern combined with exclusions constitutes a novel representation of DNA motifs: it specifies a group of DNA sequences, or motif instance group, that neither a pattern in IUB codes alone nor a PWM can represent. The user can therefore use this function to refine the definition of the motifs of interest.
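The idea of a motif instance group can be sketched as a pattern search that discards excluded instances. The text does not say exactly how MDV applies exclusions, so the rule used here (drop any window that fully matches an excluded pattern) is an assumption, and the function names are hypothetical.

```python
import re

# Standard IUB nucleotide codes and the bases each one allows.
IUB = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def compile_iub(pattern):
    """Compile an IUB pattern into a regular expression of character classes."""
    return re.compile("".join(f"[{IUB[code]}]" for code in pattern.upper()))


def find_instances(pattern, seq, excluded=()):
    """Return (start, matched sequence) pairs, skipping excluded instances."""
    wanted = compile_iub(pattern)
    banned = [compile_iub(p) for p in excluded]
    width = len(pattern)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width].upper()
        if wanted.fullmatch(window) and not any(b.fullmatch(window) for b in banned):
            hits.append((i, window))
    return hits


seq = "GGTATAAAGGTATATA"
print(find_instances("TATAWA", seq))                       # [(2, 'TATAAA'), (10, 'TATATA')]
print(find_instances("TATAWA", seq, excluded=["TATATA"]))  # [(2, 'TATAAA')]
```

In this example the pattern TATAWA covers both TATAAA and TATATA, and the exclusion narrows the group to TATAAA alone. With longer patterns and several exclusions, the remaining set can be one that no single IUB pattern describes.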

Type D

Type D (IUB-2Dprofile) displays the positional distribution of multiple motifs written in IUB codes on the selected gene set as a 2D profile. The number of motifs is restricted to 10 when using the JBIRC Website. As with Type B, the user can choose to display either one or both strands of the motifs.

Step 4: Retrieve sequence and motif information

In this step, the user can obtain further information such as the matched sequences, the positions at which the motif is found, the sequence IDs, and the strand of each motif occurrence. The user can restrict the motifs to be retrieved by specifying the region and strands, typically those that show a peak. The retrieved sequence set can then be analyzed with other motif-finding tools, which may reveal novel motifs that act cooperatively with the motifs analyzed using MDV.
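The selection described in this step amounts to filtering hit records by region and strand. The record fields and example values below are hypothetical and chosen only to mirror the information listed above; they do not reflect MDV's actual output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    seq_id: str    # sequence ID (placeholder values below)
    position: int  # start position relative to the TSS
    strand: str    # "+" or "-"
    matched: str   # the matched sequence


def select_hits(hits, start, end, strands=("+", "-")):
    """Keep hits whose position lies in start..end on one of the given strands."""
    return [h for h in hits if start <= h.position <= end and h.strand in strands]


def sequence_ids(hits):
    """Return the distinct sequence IDs of the hits, in first-seen order."""
    return list(dict.fromkeys(h.seq_id for h in hits))


hits = [
    Hit("ID_1", -30, "+", "TATAAA"),
    Hit("ID_2", -28, "-", "TATAAA"),
    Hit("ID_3", 250, "+", "TATATA"),
]
peak = select_hits(hits, -40, -20, strands=("+",))
```

The IDs returned by `sequence_ids(peak)` form the kind of restricted sequence set that could be passed on to another motif-finding tool.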

Miscellaneous

MDV is freely available at http://h-invitational.jp/mdv/. It is open to all users, and there is no login requirement. Contact: mdv-srv_help+at+m.aist.go.jp (please replace "+at+" with "@").