2.2 Annotation policies: Alternative splicing variants

Alternative splicing (AS) variants and the AS patterns at each locus were determined automatically by a computational pipeline. The pipeline was first established during the H-Invitational (H-Inv) meeting and was applied to the entire human gene data set. The algorithm works as follows. First, the exact locations of the exon-intron boundaries on the genomic sequence were determined using well-aligned exons. Then the location of each boundary was compared with those of the other transcripts belonging to the same cluster, allowing a tolerance of 10 bp in the comparison. A transcript is regarded as containing a "5'/3'-end AS" when part of its first or last exon lies inside a confirmed intron of one or more of the other transcripts. Likewise, a transcript is regarded as containing an "internal AS" when part of one of its internal exons lies inside a confirmed intron of one or more of the other transcripts (Fig. 2.2.1). At the same time, the patterns of these AS variants were classified into five typical types: "cassette (skipped exon)", "internal acceptor (alternative 3' splice) site", "internal donor (alternative 5' splice) site", "mutually exclusive" and "retained intron" (Fig. 2.2.2).
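The identification step described above can be illustrated with a minimal Python sketch. It is not the H-Inv pipeline itself. Exons are assumed to be given as (start, end) pairs in inclusive genomic coordinates, sorted along the transcript. The function names (introns, exon_inside_intron, as_locations) are hypothetical, and the way the 10 bp allowance is applied here (an overlap counts only if it exceeds 10 bases) is an assumption, since the original text does not specify it.

```python
from typing import List, Set, Tuple

Exon = Tuple[int, int]  # (start, end), inclusive genomic coordinates (assumed layout)

TOLERANCE = 10  # bp allowance for boundary comparison, as stated in the text


def introns(exons: List[Exon]) -> List[Exon]:
    """Return the introns implied by consecutive exons, as inclusive (start, end)."""
    return [(exons[i][1] + 1, exons[i + 1][0] - 1) for i in range(len(exons) - 1)]


def exon_inside_intron(exon: Exon, intron: Exon, tol: int = TOLERANCE) -> bool:
    """True if the exon overlaps the intron by more than `tol` bases.

    Assumption: overlaps of `tol` bases or fewer are treated as boundary noise.
    """
    overlap = min(exon[1], intron[1]) - max(exon[0], intron[0]) + 1
    return overlap > tol


def as_locations(query: List[Exon], others: List[List[Exon]]) -> Set[str]:
    """Report which AS locations the query transcript contains,
    judged against the introns of the other transcripts in the cluster."""
    found: Set[str] = set()
    last = len(query) - 1
    for i, exon in enumerate(query):
        # First and last exons give "5'/3'-end AS"; all others give "internal AS".
        kind = "5'/3'-end AS" if i in (0, last) else "internal AS"
        for other in others:
            if any(exon_inside_intron(exon, intron) for intron in introns(other)):
                found.add(kind)
                break
    return found


# Hypothetical transcripts in one cluster
t_full = [(100, 200), (300, 400), (500, 600)]
t_skip = [(100, 200), (500, 600)]      # lacks the middle exon
t_long = [(100, 250), (300, 400), (500, 600)]  # extended first exon

print(as_locations(t_full, [t_skip]))
print(as_locations(t_long, [t_full]))
```

In this sketch only the transcript whose exon falls inside another transcript's intron is flagged, which follows the wording of the definition above; the transcript that skips the exon is not itself flagged.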


Fig. 2.2.1 Scheme of AS identification and classification of AS location


Fig. 2.2.2 Classification of AS pattern

Because some AS variants share the same gene structure within a locus, a representative AS variant (RASV) was selected from among them. Using the RASVs, ASs that affect protein function, such as those altering protein motifs, GO terms, subcellular localization signals and transmembrane domains, were examined across the entire human genome. The results were published in the papers listed under References, and the data were registered in H-DBAS at http://h-invitational.jp/h-dbas/, which is one of the H-InvDB satellite databases.
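The grouping of variants that share a gene structure can be sketched in Python as follows. This is only an illustration under stated assumptions: two variants are treated as having the same structure when their intron chains match within 10 bp, and the representative of each group is taken to be the variant with the greatest total exonic length. Neither criterion is given in the text above, and the function names (intron_chain, same_structure, select_representatives) are hypothetical.

```python
from typing import Dict, List, Tuple

Exon = Tuple[int, int]  # (start, end), inclusive genomic coordinates (assumed layout)


def intron_chain(exons: List[Exon]) -> List[Exon]:
    """Introns implied by consecutive exons, as inclusive (start, end)."""
    return [(exons[i][1] + 1, exons[i + 1][0] - 1) for i in range(len(exons) - 1)]


def same_structure(a: List[Exon], b: List[Exon], tol: int = 10) -> bool:
    """Assumption: same structure means every intron boundary agrees within `tol` bp."""
    ia, ib = intron_chain(a), intron_chain(b)
    return len(ia) == len(ib) and all(
        abs(x[0] - y[0]) <= tol and abs(x[1] - y[1]) <= tol for x, y in zip(ia, ib)
    )


def select_representatives(variants: Dict[str, List[Exon]]) -> List[str]:
    """Group variants by structure and return one representative per group."""
    groups: List[List[str]] = []
    for name, exons in variants.items():
        for group in groups:
            # Compare against the first member of each existing group.
            if same_structure(variants[group[0]], exons):
                group.append(name)
                break
        else:
            groups.append([name])

    def exonic_length(name: str) -> int:
        return sum(end - start + 1 for start, end in variants[name])

    # Hypothetical selection rule: keep the variant with the most exonic bases.
    return [max(group, key=exonic_length) for group in groups]


# Hypothetical variants at one locus
variants = {
    "t1": [(100, 200), (300, 400)],
    "t2": [(90, 203), (298, 420)],   # same intron as t1 within 10 bp
    "t3": [(100, 200), (350, 400)],  # different acceptor site
}
print(select_representatives(variants))
```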

References
Takeda, J. et al. (2007) H-DBAS: Alternative splicing database of completely sequenced and manually annotated full-length cDNAs based on H-Invitational. Nucleic Acids Research 35 (Database issue), D104-D109

Takeda, J. et al. (2006) Large-scale identification and characterization of alternative splicing variants of human gene transcripts using 56,419 completely sequenced and manually annotated full-length cDNAs. Nucleic Acids Research 34 (14), 3917-3928

Revised: January 30, 2008