
4.5 H-ANGEL (the Human Anatomic Gene Expression Library)

The Human Anatomic Gene Expression Library (H-ANGEL) is a resource that provides information on human gene expression. H-ANGEL displays the expression patterns of transcriptional products generated by the H-Invitational project, arranged in practical tissue categories and based on tissue-specific expression data from several experimental platforms. H-ANGEL also displays the expression of different genes at their corresponding physical positions in the human genome. This information is linked to the corresponding transcript or locus annotation data stored in H-InvDB. This integration enables cross-platform comparison of the expression data. The data within H-ANGEL are updated with each new release of the transcripts or of the genome sequence build. The expression data held in H-ANGEL were generated by three types of methods on seven different platforms: PCR-based quantitative expression profiling (iAFLP [1,2]); DNA arrays (long oligomer arrays, short oligomer arrays [3,4] and glass slide cDNA microarrays [5,6,7]); and cDNA sequence tags (SAGE [8,9], EST [10,11] and MPSS [12]).
The H-ANGEL expression data can also be downloaded from the following URL: http://h-invitational.jp/hinv/dataset/download.html.
File name: H-ANGEL matrix.

4.5.1 Access to the H-ANGEL database

1) Through H-InvDB

To access the expression pattern for a particular entry from within H-InvDB, click on one of the H-ANGEL icons shown below, which appear on the H-InvDB home page and in the Locus and cDNA views. A new browser window opens displaying the information.

H-ANGEL icon in H-InvDB Locus or cDNA view

2) Direct access to the 'Home Page' of H-ANGEL

Users may also access the 'Home Page' of the H-ANGEL database directly at the following URL: http://h-invitational.jp/hinv/h-angel/.

4.5.2 Overview of the H-ANGEL viewers

The main view of H-ANGEL is composed of the following pages:

1) Home Page: This page provides access to two kinds of expression data search: 'H-Inv Locus Search for Gene Expression' and 'Expression Pattern Search'.

2) Expression Pattern View: This section of H-ANGEL displays an overview of all expression data stored in H-ANGEL according to classified tissue categories. The user reaches this page through 'H-Inv Locus Search for Gene Expression' on the 'Home Page'.

3) Expression Pattern Search View: The user reaches this section by clicking on the 'Go to Pattern Search' button on the 'Home Page'. Here the user can perform similarity searches based on user-defined expression patterns and retrieve lists of loci whose expression pattern is similar to that of the query.

4.5.3 H-ANGEL Sections

Home Page

The H-ANGEL 'Home Page' provides two search options (Fig 4.5.1). These are:

1) H-Inv Locus Search for Gene Expression

This section allows users to access the H-Inv predicted loci that have expression data and are related to the identifier or keyword used for the search. First, the user selects a type of identifier or keyword from the pull-down menu, enters a query string in the lower box and presses the 'Submit' button.
The results page then appears, displaying the expression data for the requested H-Inv Locus ID in the 'Expression Pattern View'. The following identifiers and keywords can be used for this kind of search: H-Inv Locus ID, RefSeq / FLcDNA accession number in the DDBJ/GenBank/EMBL International Nucleotide Sequence Database (INSD), UniGene ID, Locus Link ID, and keywords in the definition line or product name of the INSD entries for RefSeq / FLcDNA.

2) Expression Pattern Search

The user can access the 'Expression Pattern Search View' by clicking on the 'Go to Pattern Search' button. Using this interface, the user can define a specific expression pattern across the ten supra-tissue categories and search H-ANGEL for all loci that match the defined pattern on the specified platform(s). For this similarity search the user selects one of two correlation coefficients (Pearson's or Cosine) and specifies the platform(s) to search.


Fig 4.5.1 Home Page of H-ANGEL

Expression Pattern View

This section displays an expression pattern histogram for the classified tissue categories (Fig 4.5.2). The view integrates expression data from the following nine resources, which together cover the seven platforms:

1) iAFLP (JBIRC): iAFLP experiments conducted at JBIRC
2) iAFLP (Osaka Univ.): iAFLP experiments from the BodyMap project at Osaka University
3) Long oligomer array: Long oligomer chip experiments conducted at JBIRC (proprietary data)
4) GeneChip array U95Av2 (HuGEIndex): Affymetrix GeneChip(TM) experiments conducted at the Harvard Institute of Medicine (HuGEIndex)
5) GeneChip array U95Av2 (LSBM): Affymetrix GeneChip(TM) experiments conducted at RCAST, The University of Tokyo
6) cDNA microarray (CNRS): Glass slide microarray experiments conducted at CNRS
7) SAGE: Serial Analysis of Gene Expression data from the Ludwig Institute for Cancer Research
8) BodyMap-EST: Counts of sequence tags from normal adult tissues in dbEST at NCBI and in the BodyMap project at Osaka University
9) MPSS: Massively Parallel Signature Sequencing data from Kyushu University (proprietary data)

In the 'Information of Genes' column, the H-Inv genes mapped on the locus are displayed together with their exon / intron structure. The definition of each gene and its absolute expression level measured by SAGE are also displayed in this area.


Fig. 4.5.2 Expression Pattern View Section

All of the samples from each resource were mapped to the third level of the tissue category classification (Fig. 4.5.3), and the associated expression values were converted into values that are comparable across tissue categories. A table that cross-references the tissue category originally assigned to each dataset by its provider with the corresponding tissue category allocated by H-Inv consortium members can be viewed by clicking on 'Definitions for Tissue Categories' in the 'Expression Pattern View' section.


Fig. 4.5.3 Three Level Tissue Categories


Fig. 4.5.4 The normalization method used for the expression data contained in H-ANGEL

We normalized the relative expression values so that the sum of expression across the 40 tissues equals 1 (Fig. 4.5.4). Among the datasets used, no platform other than EST and BodyMap-EST covers all 40 tissue categories. The sparse distribution of expression data among some of the 40 tissues often makes direct comparison between tissues and across multiple platforms difficult. Because of this, we created 10 supra-categories, each grouping logically related tissues.
For each of these 10 supra-categories, we calculated the average of the normalized expression values of the tissues in that group. The example in Fig. 4.5.4 shows the averaged ratio of the gene expression level for placenta, testis, ovary and prostate. When expression data were unavailable for a tissue in a supra-category (NaN [Not a Number]), that tissue was excluded from the calculation. In the example, the reproductive supra-category contains 4 tissues, but no expression data were available for testis, so the sum of expression is divided by 3, not 4.
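The normalization and averaging steps described above can be sketched in Python as follows. This is a minimal illustration, not the H-ANGEL implementation: the function names, the mapping table and the input values are hypothetical, and only the membership of the reproductive group (placenta, testis, ovary, prostate) is taken from the text. The sketch also assumes that tissues without data are left out of the normalization sum as well as the group average.

```python
import math

# Hypothetical supra-category table; only the reproductive group
# (placenta, testis, ovary, prostate) comes from the text.
SUPRA_GROUPS = {
    "reproductive": ["placenta", "testis", "ovary", "prostate"],
}


def normalize(raw):
    """Scale the measured values so that they sum to 1.

    Tissues without data (NaN) stay NaN and do not contribute to the sum.
    """
    total = sum(v for v in raw.values() if not math.isnan(v))
    if total == 0:
        return dict(raw)  # nothing measured or all zero: leave unchanged
    return {t: (v if math.isnan(v) else v / total) for t, v in raw.items()}


def supra_average(normalized, tissues):
    """Average the normalized values of the tissues that have data."""
    values = [normalized[t] for t in tissues
              if t in normalized and not math.isnan(normalized[t])]
    if not values:
        return math.nan  # no tissue in this group was measured
    return sum(values) / len(values)


# Made-up raw values for one locus on one platform; testis was not measured.
raw = {"placenta": 6.0, "testis": math.nan, "ovary": 3.0,
       "prostate": 3.0, "liver": 8.0}
norm = normalize(raw)
reproductive = supra_average(norm, SUPRA_GROUPS["reproductive"])
print(round(reproductive, 3))  # 0.2, i.e. (0.3 + 0.15 + 0.15) / 3
```

Because testis is NaN, the reproductive average is taken over three tissues rather than four, which mirrors the example in the text.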
In the 'Expression Pattern of Genes' area, the detection points within the locus for each experiment are shown on the left side of the view. A detection point is indicated by a short vertical line when its exact position is known, and by a blue square when the position is unknown. The detection point data are aligned with the gene structure information in the 'Information of Genes' column. This makes it easier for the user to locate the position of the tag or probe on the gene and to see which exon was profiled in the experiment. Histograms of the expression patterns are displayed for both the 10 and the 40 tissue categories.
The 'Expression Information' section (Fig. 4.5.5) displays, in text format, any publicly available information relating to the clones from the locus. This section appears when an H-Inv Locus entry is selected for display. The 'iAFLP information Box' contains information on the conditions under which gene expression was measured in the iAFLP experiment for each tissue. The 'UniGene information Box' contains information on the tissues in which clone(s) from the UniGene cluster corresponding to the locus were reported. This section is not available when multiple loci are displayed.


Fig. 4.5.5 Expression Information in Text

Expression Pattern Search View

This section allows users to retrieve all expression patterns that match a pattern of interest (Fig. 4.5.6). In the 'Expression Pattern Specification' box, the user defines the expression pattern of interest by adjusting the height of the bar that represents the expression level in each tissue supra-category. The user selects the desired correlation coefficient in the 'Correlation Specification' box and can restrict the results to specific platforms by highlighting the boxes in the 'Platform Selection' box. After the user clicks the 'Search' button, the list of matching expression data is shown in the 'Expression Pattern Search Result' (Fig. 4.5.7).
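The kind of similarity search described above can be sketched in Python. This is an illustrative sketch under stated assumptions, not the code behind H-ANGEL: the function names, the locus identifiers and the profile values are hypothetical, and the textbook definitions of the Pearson correlation and cosine similarity are assumed. Profiles that lack a value for any supra-category are skipped, which is one way to model the documented behaviour that platforms lacking some categories return no results.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    # Undefined when either pattern is flat (zero variance).
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom else math.nan


def cosine(x, y):
    """Cosine similarity of two equal-length sequences."""
    denom = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return sum(a * b for a, b in zip(x, y)) / denom if denom else math.nan


MEASURES = {"pearson": pearson, "cosine": cosine}


def pattern_search(query, profiles, measure="pearson", platforms=None):
    """Rank stored profiles by similarity to the query pattern.

    profiles: iterable of (locus_id, platform, values), where values holds
    one number per supra-category (10 in H-ANGEL).
    """
    score = MEASURES[measure]
    hits = []
    for locus_id, platform, values in profiles:
        if platforms is not None and platform not in platforms:
            continue  # platform not selected by the user
        if any(math.isnan(v) for v in values):
            continue  # profile does not cover every supra-category
        s = score(query, values)
        if not math.isnan(s):
            hits.append((locus_id, platform, s))
    return sorted(hits, key=lambda h: h[2], reverse=True)


# Hypothetical data: the query asks for expression in the first category only.
query = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
profiles = [
    ("LOCUS_A", "SAGE", [0.9, 0.1, 0, 0, 0, 0, 0, 0, 0, 0]),
    ("LOCUS_C", "MPSS", [math.nan, 0.5, 0.5, 0, 0, 0, 0, 0, 0, 0]),
    ("LOCUS_D", "iAFLP", [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]),
]
for locus_id, platform, s in pattern_search(query, profiles, "pearson"):
    print(locus_id, platform, round(s, 3))
```

In this sketch LOCUS_A ranks first, LOCUS_D follows with a negative correlation, and LOCUS_C is dropped because one of its categories has no data. Passing `platforms={"iAFLP"}` restricts the search to a single platform, analogous to the 'Platform Selection' box.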


Fig. 4.5.6 The Expression Pattern Search Box

The 'Expression Pattern Search Result' section lists the expression patterns that match the pattern defined in the 'Expression Pattern Search Box' (Fig. 4.5.6). For each hit it displays the H-Inv Cluster ID, accession number, correlation value, experiment type and expression pattern across the 10 supra-categories. The user can open the 'Expression Pattern View' page for a locus of interest by clicking on its H-Inv Cluster ID.


Fig. 4.5.7 An example of the 'Expression Pattern Search Result'

Note:

In the current version, not all platforms in H-ANGEL cover the 10 supra-categories; the CNRS and MPSS data lack some categories. When a platform does not cover the categories, this search returns no results for it.

4.5.4 Summary of all the Expression Data Sources used by H-ANGEL (H-InvDB_3.0)

| Methods | Platforms | Technologies | Institutes | No. of H-Inv loci |
|---|---|---|---|---|
| PCR-based quantitative expression profiling | iAFLP | Introduced Amplified Fragment Length Polymorphism | JBIRC (Kousaku Okubo) | 11,123 |
| | | | Osaka Univ. (Kousaku Okubo) | 7,755 |
| DNA arrays | Long oligomers | 80 nucleotide length oligomer chip | JBIRC (Shinya Watanabe) | 12,730 |
| | Short oligomers | Affymetrix GeneChip(TM) | Boston Univ. (HugeIndex) | 2,725 |
| | | | Tokyo Univ. (Aburatani) | 18,183 |
| | cDNA microarray | cDNA nylon microarrays and cDNA glass microarray | CNRS (Charles Auffray) | 6,645 |
| cDNA sequence tags | SAGE | Serial Analysis of Gene Expression | Ludwig Institute for Cancer Research | 17,480 |
| | EST + BodyMap | Expressed Sequence Tags | NCBI | 26,946 |
| | | 3'-directed cDNA library | BodyMap | |
| | MPSS | Massively Parallel Signature Sequencing | NIG (Kousaku Okubo) | 20,193 |

References

  1. Kawamoto, S., Ohnishi, T., Kita, H., Chisaka, O. & Okubo, K. Expression profiling by iAFLP: a PCR-based method for genome-wide gene expression profiling. Genome Res 9, 1305-1312 (1999).
  2. Sese, J. et al. BodyMap incorporated PCR-based expression profiling data and a gene ranking system. Nucleic Acids Res 29, 156-158 (2001).
  3. Haverty, P. M. et al. HugeIndex: a database with visualization tools for high-density oligonucleotide array data from normal human tissues. Nucleic Acids Res 30, 214-217 (2002).
  4. Ge, X., Yamamoto, S., Tsutsumi, S. & Aburatani, H. Comprehensive gene expression database of normal human tissues using oligonucleotide microarray. Submitted.
  5. Pietu, G. et al. The Genexpress IMAGE knowledge base of the human brain transcriptome: a prototype integrated resource for functional and computational genomics. Genome Res 9, 195-209 (1999).
  6. Pietu, G. et al. The Genexpress IMAGE knowledge base of the human muscle transcriptome: a resource of structural, functional, and positional candidate genes for muscle physiology and pathologies. Genome Res 9, 1313-1320 (1999).
  7. Array s/IMAGE: http://www.vjf.cnrs.fr/FRE2571, personal communication, unpublished. Note: the Array s/IMAGE (CNRS FRE2571, Genexpress, France)
  8. Velculescu, V. E., Zhang, L., Vogelstein, B. & Kinzler, K. W. Serial analysis of gene expression. Science 270, 484-487 (1995).
  9. Boon, K. et al. An anatomy of normal and malignant gene expression. Proc Natl Acad Sci USA 99, 11287-11292 (2002).
  10. Boguski, M. S., Lowe, T. M. J. & Tolstoshev, C. M. dbEST: database for Expressed Sequence Tags. Nat Genet 4, 332-333 (1993).
  11. Kawamoto, S. et al. BodyMap: a collection of 3' ESTs for analysis of human gene expression information. Genome Res 10, 1817-1827 (2000).
  12. Brenner, S. et al. Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nat Biotechnol 18, 630-634 (2000).
Revised: September 12, 2007