Evola (Evolutionary annotation database, http://www.h-invitational.jp/evola/)
is a sub-database of H-InvDB providing ortholog information among
vertebrates for the human genes (representative transcripts, one
transcript per gene, 2.3.2).
Orthologs are genes in different species that separated from a common
ancestor by speciation.
The tree icon is a gateway to Evola. For example, if users click the icon in the Transcript view (4.1) of a gene, the Evola main page showing its orthologs will appear.
This help page (4.10) explains the database contents and how to use
Evola. Information about the analysis pipeline of orthologs and gene
families is described in 2.5 Evolutionary annotation.
4.10.0 Menu
- 4.10.1 Top/Search page
- - 4.10.1a Top/Search
- - 4.10.1b Search result
- 4.10.2 Main page
- - 4.10.2a Alignment
- - 4.10.2b Locus maps
- 4.10.3 Gene family browser
- Currently not in service
The Evola introduction, release information, and statistics (numbers of orthologs and UCSC
genome version) are shown. Search is available on this Top/Search
page. This page is provided in both English and Japanese.
Figure 4.10.1a Top/Search
-
Common header
This part always appears in Evola. The H-InvDB and Evola icons at the
upper-left corner link to their top pages. A link to the
H-InvDB Advanced Search is at the upper-right corner. In the black band
segment (Evola menu), there are links to "Home" (Evola Top/Search page),
"Data download" (Download page), "Contact us" (Mail form) and "Help"
(this page).
-
Download: On the download page, the ortholog list, nucleotide and
amino acid sequences, phylogenetic trees, and dN/dS ratios between ortholog pairs are available.
-
Search bar
Orthologs can be searched in the three ways listed below. Users can specify "Species" and "Annotation status" (2.5) in the option menu.
-
Keyword: Gene names (Definition) of H-InvDB human representative transcripts (2.3.2)
-
Accession number: Accession numbers of all transcripts of human and other vertebrates (H-InvDB HIT and HIX, DDBJ, Ensembl, RefSeq)
-
Gene symbol: Gene symbols of HGNC for H-InvDB human representative transcripts (2.3.2)
-
Related sites
The comparative genome browser database "G-compass" is also open to the public free of charge. Related datasets of Evola are provided.
Figure 4.10.1b Search result
-
Page control
Search results appear in tables of the human genes (five results for the human representative transcripts (2.3.2) per page). Show other pages (if available) by clicking the prev or next buttons.
-
Human gene table
The upper half (light blue) of each table contains information on the human representative transcript (2.3.2) and a link button to the Main page (4.10.2). Search words ("lung cancer" in Fig. 4.10.1b) are highlighted in red. The lower half (light yellow) shows the presence of the vertebrate orthologs. Click the icon (if orthologs are available) to show the accession numbers of the orthologs.
Ortholog information is shown in the left frame. The "Alignment" and "Locus maps" buttons switch the contents of the right frame.
Figure 4.10.2 Main page
-
Human gene annotation and the content-switch buttons in the right frame
Human gene annotations (accession numbers, gene name, amino acid length,
etc.) are automatically obtained through the H-InvDB Web service (3.4). The "Alignment" and "Locus maps" buttons change the content of the right frame.
-
Evolutionary annotations and sequences and molecular phylogenetic information
Links to the gene family browser, ortholog list, etc. are shown. Sequences and phylogenetic trees are available for download. Click the icon to close each content.
-
Ortholog: Ortholog(s) of the vertebrates (Details are described in 2.5 Evolutionary annotation)
-
Computational analysis: Orthologs identified by the method based on similarity of genomic and amino acid sequences.
-
Manual curation: Orthologs curated with a molecular phylogenetic approach through
the comparison between the species tree and the gene tree.
-
Similarity (%): Calculated as "number of matched nucleotides / length of
pairwise alignment (aa)" (internal stop codons and frameshifts are
skipped)
-
Alignment length (aa): Length of pairwise alignment with human protein (amino acid) sequence by FASTA (without stop codon)
-
Similarity search: Results of homology search querying the human protein to known protein databases (for reference)
-
Data download: Download and view the sequences and the
phylogenetic tree. Short and truncated sequences, which largely reduce the unambiguous alignment sites subject to tree inference, are removed from the alignment by MaxAlign.
-
Sequence: Transcript (RNA) and protein (amino acid) sequences of
the orthologs in FASTA format (header information like ">BC042669_Mus.ddbj"
denotes "accession number"_"genus"."sequence distributor (hinv, ddbj,
ensembl, refseq)")
-
Gene family tree: Phylogenetic tree of an ortholog group or a gene family in Newick tree format. The tree is inferred by Weighted Neighbor Joining (Weighbor) assuming WAG+Γ-F. (Use NJplot or TreeView to show the tree.)
-
View by ATV: Phylogenetic tree drawn by ATV (A Tree Viewer) in the web browser (Java installation is required.)
-
Right frame (Alignment / Locus maps)
The contents of the right frame, Alignment (4.10.2a) and Locus maps (4.10.2b), are switched by the buttons in the left frame.
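The header convention of the downloadable FASTA files and the Similarity (%) definition described above can be illustrated with a short Python sketch. This is not code from Evola. The names `EvolaHeader`, `parse_header` and `percent_similarity` are hypothetical, and the handling of skipped positions is an assumption: columns containing a stop codon character ("*") are dropped from both the match count and the alignment length, and gap columns count as mismatches. How Evola marks frameshifts is not stated in this help page, so any such marker has to be passed in by the caller.

```python
from typing import NamedTuple

# Distributor labels listed in the help text for downloaded FASTA headers.
KNOWN_DISTRIBUTORS = {"hinv", "ddbj", "ensembl", "refseq"}


class EvolaHeader(NamedTuple):
    accession: str
    genus: str
    distributor: str


def parse_header(line: str) -> EvolaHeader:
    """Split '>accession_genus.distributor' into its three parts.

    Splitting from the right keeps accessions that themselves contain
    '_' or '.' (for example RefSeq-style identifiers) in one piece.
    """
    text = line.strip().lstrip(">")
    body, _, distributor = text.rpartition(".")
    accession, _, genus = body.rpartition("_")
    distributor = distributor.strip().lower()
    if not accession or not genus or distributor not in KNOWN_DISTRIBUTORS:
        raise ValueError(f"unexpected header format: {line!r}")
    return EvolaHeader(accession, genus, distributor)


def percent_similarity(human: str, other: str, skip: str = "*") -> float:
    """Percentage of matching columns in a pairwise alignment.

    Assumption: columns where either sequence has a character from
    `skip` are ignored entirely; gap columns ('-') are counted in the
    alignment length but never as matches.
    """
    if len(human) != len(other):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(human, other)
               if a not in skip and b not in skip]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(parse_header(">BC042669_Mus.ddbj"))
    # Toy alignment, not real Evola data.
    print(percent_similarity("MKT-A*", "MKSLA*"))
```

The parser splits from the right because the genus and distributor fields are simple words, while accession numbers from some sources may contain underscores or version suffixes.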
Figure 4.10.2a Alignment
-
Alignment of orthologs and Alignment of gene family
Amino acid sequence alignment of human and the vertebrate orthologs
is shown. The alignment of the gene family including the ortholog is shown, if the human
gene has paralog(s) and the number of the members of the family is 50 or less.
Alignment of gene family/group and Alignment of orthologs
at the upper-right corner are links to each other.
These alignments were constructed by MAFFT and arranged in order of the lineage, from human to fish.
-
Popup of species name: Place the mouse cursor on a sequence name (accession number and genus) to show its common name (for example, chimpanzee in Fig. 4.10.2a)
-
"." and "_" at the end of sequence names: Denoting ortholog (".") and paralog ("_") in the gene family alignment
-
[View] link of human paralogs: Jump to Evola Main page of the human gene
-
Amino acid abbreviation table
Click the Amino acid abbreviation link to open the table in a new window. Colors of amino acid residues are based on ClustalX default colors.
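For users who copy sequence names out of the gene family alignment, the trailing "." and "_" markers described above can be interpreted with a few lines of Python. This is a minimal sketch, not part of Evola. The function name `classify_member` and the example names are hypothetical, since the exact layout of sequence names beyond "accession number and genus" is not specified here.

```python
def classify_member(name: str) -> str:
    """Interpret the trailing marker of a gene family sequence name.

    '.' marks an ortholog and '_' marks a paralog, as described in the
    help text; anything else is reported as 'unmarked'.
    """
    name = name.strip()
    if name.endswith("."):
        return "ortholog"
    if name.endswith("_"):
        return "paralog"
    return "unmarked"


if __name__ == "__main__":
    # Hypothetical sequence names for illustration only.
    for example in ["ACC0001_Pan.", "ACC0002_Homo_", "ACC0003_Mus"]:
        print(example, classify_member(example))
```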
Figure 4.10.2b Locus maps
-
Figure legend
Figure legend of gene locus maps.
-
Representative transcript: Human representative transcript (2.3.2) or other vertebrates' representative transcript
-
Representative alternative splicing variant (RASV): Human representative alternative splicing variant (2.2) (RASV includes representative transcript)
-
Untranslated region (UTR): Depicted for human only
-
Locus maps
All transcripts in the orthologous gene locus are shown as well as the
orthologous transcript. Comparison of transcript variants (consisting of
different exon structures in a locus) between species is available.
Click the icon to close each locus, which is convenient when comparing distant species (for example, human and fish).
-
Locus information: Species name (common name and genus), human locus ID (HIX), Gene symbol (human), genomic location of the locus
-
G-integra: Link to G-integra genome browser (4.3)
-
Locus map: Exon structure of all transcripts in the locus (showing ~50 sequences)
-
Transcript types and colors: Transcripts are colored as light
green for DDBJ (H-Inv full-length cDNA), green for DDBJ (Other RNA),
blue for RefSeq and orange for Ensembl (Darker color denotes higher
identity to genome) (2.0 Annotation items in H-InvDB or 4.3 G-integra)
-
Transcript list
All transcripts are listed with links to their original sites. The transcript in
red is the ortholog (representative transcript or RASV). Both the HIT ID
and the accession number are listed for human transcripts.
Revised: December, 2010