5.0 FF format

5.1 JBIRC format
5.2 Locus information
5.3 cDNA information

5.1 JBIRC format

The general form of each line is:

    [term_name]:[term_value]
    e.g., CDNA_ACCESSION-NO: AK000008
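As an illustration only, a line of this form could be split as in the Python sketch below. The function name parse_ff_line is hypothetical. The sketch assumes that the first colon separates the term name from the term value and that surrounding whitespace is not significant.

```python
def parse_ff_line(line):
    """Split one flat-file line into (term_name, term_value).

    Assumption: the first colon separates the name from the value,
    so any later colons are kept as part of the value.
    """
    name, sep, value = line.rstrip("\n").partition(":")
    if not sep:
        raise ValueError(f"no ':' separator in line: {line!r}")
    return name.strip(), value.strip()


print(parse_ff_line("CDNA_ACCESSION-NO: AK000008"))
# ('CDNA_ACCESSION-NO', 'AK000008')
```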

Contents of the molecular evolutionary analysis data are as follows:

Acc#.st.html
The topological distance between NJ and NJML+ trees is shown. Information about the log-likelihood values of the examined tree topologies is provided.

Acc#.sns
The numbers of synonymous and nonsynonymous substitutions and other information in the entire alignment are shown in the second line:

1: number of synonymous substitutions per site
2: number of nonsynonymous substitutions per site
3: ratio of nonsynonymous to synonymous substitutions
4: Z-statistics of the difference between the synonymous and nonsynonymous substitutions
5: number of synonymous sites
6: number of nonsynonymous sites
7: number of synonymous substitutions
8: number of nonsynonymous substitutions
9: number of codons examined over the total number of codons including gaps
10: ratio of transition to transversion
11: type of genetic code
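The eleven columns listed above can be given names when the second line of an Acc#.sns file is read. The Python sketch below is only an illustration: the field names are hypothetical, the columns are assumed to be separated by whitespace, and the sample line contains made-up values. Values are kept as strings because the exact numeric layout of each column is not described here.

```python
# Hypothetical names for columns 1-11 of the second line of Acc#.sns.
SNS_SUMMARY_FIELDS = [
    "syn_subst_per_site",      # 1
    "nonsyn_subst_per_site",   # 2
    "nonsyn_to_syn_ratio",     # 3
    "z_statistic",             # 4
    "syn_sites",               # 5
    "nonsyn_sites",            # 6
    "syn_substitutions",       # 7
    "nonsyn_substitutions",    # 8
    "codons_examined",         # 9
    "ts_tv_ratio",             # 10
    "genetic_code",            # 11
]


def parse_sns_summary(line):
    """Map the whitespace-separated columns of the summary line to names.

    Assumption: exactly 11 whitespace-separated columns are present.
    """
    values = line.split()
    if len(values) != len(SNS_SUMMARY_FIELDS):
        raise ValueError(
            f"expected {len(SNS_SUMMARY_FIELDS)} columns, got {len(values)}"
        )
    return dict(zip(SNS_SUMMARY_FIELDS, values))


# Made-up values, for illustration only.
sample_line = "0.50 0.10 0.20 -3.2 120.5 379.5 60 38 150/160 2.0 standard"
summary = parse_sns_summary(sample_line)
print(summary["nonsyn_to_syn_ratio"])
```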

Results of window analysis along the alignment are shown after the fourth line.

1: position of codons examined; start plus range
2: number of nonsynonymous substitutions
3: number of identical nonsynonymous sites
(2 + 3 = number of nonsynonymous sites)
4: number of synonymous substitutions
5: number of identical synonymous sites
(4 + 5 = number of synonymous sites)
6: probability by Fisher's exact test (two-tailed)
7: statistical significance

    - negative selection at 5%
    -- negative selection at 1%
    --- negative selection at 0.1%
    + positive selection at 5%
    ++ positive selection at 1%
    +++ positive selection at 0.1%
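Columns 2 to 5 of a window form a 2 x 2 table (nonsynonymous or synonymous, substituted or identical), to which a two-tailed Fisher's exact test can be applied. The Python sketch below shows one common way to compute such a test and to derive a mark from the result. It is not the program used to produce the files. The function names are hypothetical, the counts are assumed to be integers, the strict comparison with each threshold is an assumption, and the direction rule (positive when the substituted fraction of nonsynonymous sites exceeds that of synonymous sites) is also an assumption.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-7)  # tolerance for floating-point ties
    )
    return min(1.0, total)


def significance_mark(nonsyn_subst, nonsyn_ident, syn_subst, syn_ident):
    """Return '', '+'..'+++' or '-'..'---' for one window (assumed rule)."""
    p = fisher_exact_two_tailed(nonsyn_subst, nonsyn_ident, syn_subst, syn_ident)
    if p < 0.001:
        level = 3
    elif p < 0.01:
        level = 2
    elif p < 0.05:
        level = 1
    else:
        return ""
    nonsyn_sites = nonsyn_subst + nonsyn_ident
    syn_sites = syn_subst + syn_ident
    if nonsyn_sites == 0 or syn_sites == 0:
        return ""
    positive = nonsyn_subst / nonsyn_sites > syn_subst / syn_sites
    return ("+" if positive else "-") * level


# Made-up window: 8 of 10 nonsynonymous sites and 1 of 6 synonymous sites changed.
print(fisher_exact_two_tailed(8, 2, 1, 5))  # about 0.035
print(significance_mark(8, 2, 1, 5))        # +
```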

Utility Formats

The format of the evolutionary analysis files is as follows:
Acc#.aln was created by Clustal W.
Acc#.fl.nj.phb and Acc#.fl.njml.phb are Newick format files. The files can be read by phylogenetic tree viewers such as TreeView.
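Because Newick is plain text, the sequence names in a tree can also be listed without a viewer. The Python sketch below is a minimal illustration, not a full Newick parser. The function name is hypothetical, the example tree is made up, and leaf labels are assumed to be unquoted and free of whitespace, brackets, commas, colons and semicolons.

```python
import re

# A leaf label is taken to be the token that directly follows "(" or ",".
# Assumption: labels are unquoted (quoted Newick labels are not handled).
_LEAF = re.compile(r"[(,]\s*([^(),:;\[\]\s]+)")


def newick_leaf_names(text):
    """Return the leaf labels of a Newick tree in order of appearance."""
    return _LEAF.findall(text)


# Made-up tree, for illustration only.
example = "((AK000008:0.1,seqB:0.2):0.05,\nseqC:0.3);"
print(newick_leaf_names(example))
# ['AK000008', 'seqB', 'seqC']
```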

5.2 Locus information

Domain_name: Description
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CLUSTER_CLUSTER-ID: H-Inv cluster ID of this cluster assigned by H-Invitational database.
CLUSTER_CLUSTER-ID-VERSION: H-Inv cluster version ID of this cluster assigned by H-Invitational database.
CLUSTER_REP-H-INVITATIONAL-ID: H-Inv ID of representative curated cDNA in this H-Inv cluster.
CLUSTER_REP-H-INVITATIONAL-ID-VERSION: H-Inv version ID of representative curated cDNA in this H-Inv cluster.
CLUSTER_REP-ACCESSION-NO: DNA database accession number of representative curated cDNA in the H-Inv cluster.
CLUSTER_REP-ACCESSION-NO-VERSION: DNA database accession number plus a version number after the decimal point of representative curated cDNA in the H-Inv cluster.
CLUSTER_REP-DEFINITION: Functional definition of this H-Inv cluster, i.e. the functional definition of the representative transcript.
CLUSTER_GENOME-STATISTICS: Version information of the genome sequence which the release is based on.
CLUSTER_DATE-CREATED: Date of creation of this H-Inv cluster entry in H-Inv database.
CLUSTER_DATE-LAST-UPDATE: Date of the last update of this H-Inv cluster entry.
CLUSTER_DATE-LAST-MODIFIED: Date of the last modification of this H-Inv cluster entry.
CLUSTER_H-INV-DB-RELEASE: The release number of H-InvDB.
CLUSTER_HIX-RELEASE: The release number of H-InvDB HIX annotation.
CLUSTER_DATE-ANNOTATION-LAST-UPDATE: Date of the last update of the H-InvDB locus annotation.
CLUSTER_DATE-ANNOTATION-LAST-MODIFIED: Date of the last modification of the H-InvDB locus annotation.
CLUSTER_HIX-ANNOTATION-RELEASE: Date of the first release of the H-InvDB locus annotation.
CLUSTER_GENE-NAME_GENEW: HUGO approved gene symbol of the cluster.
CLUSTER_CHROMOSOME-NUMBER: Chromosomal number where this H-Inv cluster mapped.
CLUSTER_BAND_CHROMOSOME-BAND: Chromosomal band where this H-Inv cluster mapped.
CLUSTER_START: Start position of the location on the chromosome where this H-Inv cluster mapped
CLUSTER_END: End position of the location on the chromosome where this H-Inv cluster mapped
CLUSTER_STRAND: Strand to which H-Inv cluster mapped on the genome; forward or reverse.
CLUSTER_UNORDERED-REGION_RANDOM-FLAG: Flag indicating whether the cDNA/cluster is mapped on an unordered part (random) of the chromosome.
CLUSTER_UNORDERED-REGION_CHROMOSOME-NUMBER: Chromosomal number where this H-Inv cluster mapped if the cDNA/cluster is mapped on an unordered part (random) of the chromosome.
CLUSTER_UNORDERED-REGION_LOCATION_NCBI-CONTIG: NCBI Contigs (prefix NT_ or NW_) record if the cDNA/cluster is mapped on an unordered part (random) of the chromosome.
CLUSTER_UNORDERED-REGION_LOCATION_START: Start position of the location on the chromosome where this H-Inv cluster mapped if the cDNA/cluster is mapped on an unordered part (random) of the chromosome.
CLUSTER_UNORDERED-REGION_LOCATION_END: End position of the location on the chromosome where this H-Inv cluster mapped if the cDNA/cluster is mapped on an unordered part (random) of the chromosome.
CLUSTER_UNORDERED-REGION_LOCATION_STRAND: Strand to which this H-Inv cluster mapped on the genome if the cDNA/cluster is mapped on an unordered part (random) of the chromosome.
CLUSTER_EXPRESSION_DATE-LAST-UPDATE: Date of the last update of the tissue/organ expression profile section of this H-Inv cluster.
CLUSTER_EXPRESSION_DATE-LAST-MODIFIED: Date of the last modification of the tissue/organ expression profile section of this H-Inv cluster.
CLUSTER_EXPRESSION_EXPRESSION-RELEASE: The release number of H-InvDB expression section.
CLUSTER_EXPRESSION_TISSUE-SPECIFIC-EXPRESSION: Organ/Tissue category if the expression pattern of this H-Inv cluster was determined as tissue-specific. Tissue types are classified into 10 classes: (1) neural, (2) blood/spleen/LND, (3) dermal_connective, (4) placenta/testis/ovary, (5) muscle/heart, (6) stomach/colon, (7) liver, (8) lung, (9) kidney/bladder, (10) endocrine/exocrine.
CLUSTER_NOTES_POSSIBLE-FUSED-LOCUS: Feature of the cluster containing "possible fused cluster"(Note: stated by human curation only).
CLUSTER_AS-PATTERN: 5'terminal pattern of alternative splicing was suggested; the pattern of alternative splicing of this H-Inv cluster is determined by both human curation and computational analysis (see details in "Annotation policy" section).
					: 3'terminal pattern of alternative splicing was suggested; the pattern of alternative splicing of this H-Inv cluster is determined by both human curation and computational analysis (see details in "Annotation policy" section).
					: Internal AS pattern of alternative splicing was suggested; the pattern of alternative splicing of this H-Inv cluster is determined by both human curation and computational analysis (see details in "Annotation policy" section).
					: Cassette pattern of alternative splicing was suggested; the pattern of alternative splicing of this H-Inv cluster is determined by both human curation and computational analysis (see details in "Annotation policy" section).
					: Internal acceptor site pattern of alternative splicing was suggested; the pattern of alternative splicing of this H-Inv cluster is determined by both human curation and computational analysis (see details in "Annotation policy" section).
					: Internal donor site pattern of alternative splicing was suggested; the pattern of alternative splicing of this H-Inv cluster is determined by both human curation and computational analysis (see details in "Annotation policy" section).
					: Mutually exclusive pattern of alternative splicing was suggested; the pattern of alternative splicing of this H-Inv cluster is determined by both human curation and computational analysis (see details in "Annotation policy" section).
					: Retained intron pattern of alternative splicing was suggested; the pattern of alternative splicing of this H-Inv cluster is determined by both human curation and computational analysis (see details in "Annotation policy" section).
CLUSTER_SPLICING-ISOFORM_ACCESSION-NO: DNA databank accession number of the alternative isoform determined by both human curation and computational analysis (see details in "Annotation policy" section).
CLUSTER_SPLICING-ISOFORM_ACCESSION-NO-VERSION: DNA databank accession number plus a version number after the decimal point of the alternative isoform determined by both human curation and computational analysis (see details in "Annotation policy" section).
MAPPED_MEMBER_H-INVITATIONAL-ID: H-Inv ID (ID for individual curated cDNA assigned by H-Invitational database) of the member of this H-Inv cluster.
MAPPED_MEMBER_H-INVITATIONAL-ID-VERSION: H-Inv version ID of the member of this H-Inv cluster.
MAPPED_MEMBER_ACCESSION-NO: DNA databank accession No. of the member of this H-Inv cluster.
MAPPED_MEMBER_ACCESSION-NO-VERSION: DNA databank accession No. plus a version number after the decimal point of the member of this H-Inv cluster.
MAPPED_MEMBER_CURATION-STATUS: A flag indicating the curation status of this H-Inv transcript, whether it is "human-curated" or "auto-annotated".
MAPPED_MEMBER_DATA-SOURCE_DB-REFERENCE_PROTEIN-MOTIF-ID: Protein ID or InterPro ID of known non-hypothetical protein which this locus member H-Inv transcript was identical to or similar to, or InterPro domain.
MAPPED_MEMBER_DATA-SOURCE_IDENTITY: % identity to known non-hypothetical protein which this locus member H-Inv transcript was identical to or similar to.
MAPPED_MEMBER_DATA-SOURCE_COVERAGE: % coverage to known non-hypothetical protein which this locus member H-Inv transcript was identical to or similar to.
MAPPED_MEMBER_DATA-SOURCE_HOMOLOGOUS_SPECIES: Species of known non-hypothetical protein which this locus member H-Inv transcript was identical to or similar to.
MAPPED_MEMBER_DEFINITION: Functional definition of this H-Inv transcript.
MAPPED_MEMBER_SIMILARITY-CATEGORY: Similarity category of protein-coding transcript according to its sequence similarity to known non-hypothetical protein. See details in "Annotation policy" section.
MAPPED_MEMBER_GENE-NAME_GENEW: HUGO approved gene symbol of the cluster member.
MAPPED_MEMBER_INCOMPLETE-SPLICING-REVISED: Flag indicating that the remaining intronic sequences of the cDNA were revised.
MAPPED_MEMBER_FRAMESHIFT-ERROR-REVISED: Flag indicating that the predicted frameshift error of the cDNA was revised.
MAPPED_MEMBER_DB-REFERENCE_ENSEMBL: Corresponding Ensembl ID of this cluster member.
MAPPED_MEMBER_DB-REFERENCE_ENTERZGENE: Corresponding Entrez Gene ID of this cluster member.
MAPPED_MEMBER_DB-REFERENCE_REFSEQ: Corresponding RefSeq ID of this cluster member.
MAPPED_MEMBER_GENOME_START: Start position of the location on the chromosome to which this cluster member mapped.
MAPPED_MEMBER_GENOME_END: End position of the location on the chromosome to which this cluster member mapped.
MAPPED_MEMBER_LOCUS_START: Start position of the location in this cluster to which this cluster member mapped.
MAPPED_MEMBER_LOCUS_END: End position of the location in this cluster to which this cluster member mapped.
MAPPED_MEMBER_EXON-CDNA_START: Start position of exon of this cluster member; region of cDNA that codes for a portion of spliced mRNA that may contain 5' UTR, all CDSs and 3' UTR.
MAPPED_MEMBER_EXON-CDNA_END: End position of exon of this cluster member; region of cDNA that codes for a portion of spliced mRNA that may contain 5' UTR, all CDSs and 3' UTR.
MAPPED_MEMBER_EXON-GENOME_START: Start position of exon of this cluster member; region of genome that codes for a portion of spliced mRNA that may contain 5' UTR, all CDSs and 3' UTR.
MAPPED_MEMBER_EXON-GENOME_END: End position of exon of this cluster member; region of genome that codes for a portion of spliced mRNA that may contain 5' UTR, all CDSs and 3' UTR.
PATHOLOGY_DATE-LAST-UPDATE: Date of the last update of the H-InvDB DiseaseInfo section.
PATHOLOGY_DATE-LAST-MODIFIED: Date of the last modification of the H-InvDB DiseaseInfo section.
PATHOLOGY_DISEASE-RELEASE: The release number of H-InvDB DiseaseInfo section.
PATHOLOGY_KNOWN-DISEASE-GENE_CDNA-OMIM_DB-REFERENCE_OMIM-DISEASE: Online Mendelian Inheritance in Man (OMIM) ID of known-disease assigned to this cluster.
PATHOLOGY_KNOWN-DISEASE-GENE_CDNA-OMIM_DISEASE_DISEASE-NAME: OMIM title (Disease name) registered for OMIM ID of known-disease assigned to this cluster.
PATHOLOGY_KNOWN-DISEASE-GENE_CDNA-OMIM_DISEASE-RELATION: OMIM title (Disease name) of known-disease assigned to this cluster without OMIM ID.
PATHOLOGY_KNOWN-DISEASE-GENE_CDNA-OMIM_DB-REFERENCE_OMIM: Online Mendelian Inheritance in Man (OMIM) ID assigned to this cluster by relation of protein.
PATHOLOGY_KNOWN-DISEASE-GENE_CDNA-OMIM_MORBID-MAP_OMIM-TITLE: OMIM title assigned to this cluster by relation of protein without OMIM ID.
PATHOLOGY_ORPHAN-DISEASE_DB-REFERENCE_LOCUS-OMIM: Online Mendelian Inheritance in Man (OMIM) ID of co-localized disease assigned to the chromosomal location.
//: The end of an entry
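Since each entry ends with a // line and a term such as MAPPED_MEMBER_ACCESSION-NO can occur once per cluster member, a reader needs to split the file into entries and keep repeated terms. The Python sketch below shows one way to do this. The function name is hypothetical, the sample values are made up, and the sketch assumes that the terminator line starts with // and that the first colon on a line separates the term name from its value.

```python
from collections import defaultdict


def read_entries(lines):
    """Yield one dict per entry, mapping term_name -> list of values."""
    entry = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("//"):      # end of the current entry
            if entry:
                yield dict(entry)
            entry = defaultdict(list)
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue                   # assumption: ignore lines without ':'
        entry[name.strip()].append(value.strip())
    if entry:                          # tolerate a missing final terminator
        yield dict(entry)


# Made-up sample, for illustration only.
sample = [
    "CLUSTER_CLUSTER-ID: HIX0000001",
    "MAPPED_MEMBER_ACCESSION-NO: AK000008",
    "MAPPED_MEMBER_ACCESSION-NO: AK000009",
    "//",
    "CLUSTER_CLUSTER-ID: HIX0000002",
    "//",
]
entries = list(read_entries(sample))
print(len(entries))
print(entries[0]["MAPPED_MEMBER_ACCESSION-NO"])
```

Keeping every value in a list avoids silently dropping repeated terms, at the cost of indexing with [0] for terms that occur only once.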

5.3 cDNA information

Domain_name:Description
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CDNA_CLUSTER-ID:H-Inv cluster ID (ID for the cluster of cDNA(s) assigned by H-Invitational database) of this cluster.
CDNA_CLUSTER-ID-VERSION:H-Inv version cluster ID of this cluster.
CDNA_H-INVITATIONAL-ID:H-Inv ID (ID for individual curated cDNA assigned by H-Invitational database) of this transcript.
CDNA_H-INVITATIONAL-ID-VERSION:H-Inv version ID of this transcript.
CDNA_ACCESSION-NO:DNA databank accession number of cDNA.
CDNA_ACCESSION-NO-VERSION:DNA databank accession number plus a version number after the decimal point.
CDNA_GENOME-STATISTICS:Version information of the genome sequence which the release is based on.
CDNA_DATE-CREATED:Date of creation of this H-Inv cDNA entry in H-Inv database.
CDNA_DATE-LAST-UPDATE:Date of the last update of this H-Inv cDNA entry.
CDNA_DATE-LAST-MODIFIED:Date of the last modification of this H-Inv cDNA entry.
CDNA_H-INV-DB-RELEASE:The release number of H-InvDB.
CDNA_HIT-RELEASE:The release number of H-InvDB HIT annotation.
CDNA_DATE-ANNOTATION-LAST-UPDATE:Date of the last update of the H-InvDB cDNA annotation.
CDNA_DATE-ANNOTATION-LAST-MODIFIED:Date of the last modification of the H-InvDB cDNA annotation.
CDNA_HIT-ANNOTATION-RELEASE:Date of the first release of the H-InvDB cDNA annotation.
CDNA_CHROMOSOME-NUMBER:Chromosomal number where the cDNA mapped.
CDNA_BAND_CHROMOSOME-BAND:Chromosomal band where the cDNA mapped.
CDNA_START:Start position of the location on the chromosome where the cDNA mapped
CDNA_END:End position of the location on the chromosome where the cDNA mapped
CDNA_STRAND:Strand to which the cDNA mapped on the genome; forward or reverse.
CDNA_UNORDERED-REGION_RANDOM-FLAG:Flag indicating whether the cDNA/cluster is mapped on an unordered part (random) of the chromosome.
CDNA_UNORDERED-REGION_CHROMOSOME-NUMBER:Chromosomal number where the cDNA mapped if the cDNA/cluster is mapped on unordered part(random) of the chromosome.
CDNA_UNORDERED-REGION_LOCATION_NCBI-CONTIG:NCBI Contigs (prefix NT_ or NW_) record if the cDNA/cluster is mapped on unordered part (random) of the chromosome.
CDNA_UNORDERED-REGION_LOCATION_START:Start position of the location on the chromosome where the cDNA mapped if the cDNA/cluster is mapped on unordered part(random) of the chromosome.
CDNA_UNORDERED-REGION_LOCATION_END:End position of the location on the chromosome where the cDNA mapped if the cDNA/cluster is mapped on unordered part(random) of the chromosome.
CDNA_UNORDERED-REGION_LOCATION_STRAND:Strand to which the cDNA mapped on the genome if the cDNA/cluster is mapped on unordered part(random) of the chromosome.
CDNA_LOCUS_START:Start position of the location within this cluster.
CDNA_LOCUS_END:End position of the location within this cluster.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_CLUSTER-ID:H-Inv cluster ID if this H-Inv transcript is mapped on human genome at more than one location for an identical condition.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_CHROMOSOME-NUMBER:Chromosomal number of H-Inv cluster mapped if this H-Inv transcript is mapped on human genome at more than one location for an identical condition.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_START:Start position on the genome of H-Inv cluster mapped if this H-Inv transcript is mapped on human genome at more than one location for an identical condition.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_END:End position of H-Inv cluster mapped if this H-Inv transcript is mapped on human genome at more than one location for an identical condition.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_STRAND: Strand to which H-Inv cluster mapped on the genome; forward or reverse, for the clusters mapped on human genome at more than one location for an identical condition.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_UNORDERED-REGION_RANDOM-FLAG:Flag for if the cDNA/cluster is mapped on unordered part(random) of the chromosome, for the clusters mapped on human genome at more than one location for an identical condition.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_UNORDERED-REGION_CHROMOSOME-NUMBER:Chromosomal number of H-Inv cluster mapped if this H-Inv transcript is mapped on human genome at more than one location for an identical condition.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_UNORDERED-REGION_LOCATION_NCBI-CONTIG:NCBI Contigs (prefix NT_ or NW_) record if the cDNA/cluster is mapped on unordered part (random) of the chromosome, for the clusters mapped on human genome at more than one location for an identical condition.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_UNORDERED-REGION_LOCATION_START:Start position on the genome of H-Inv cluster mapped if this H-Inv transcript is mapped on human genome at more than one location for an identical condition.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_UNORDERED-REGION_LOCATION_END:End position of H-Inv cluster mapped if this H-Inv transcript is mapped on human genome at more than one location for an identical condition.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_UNORDERED-REGION_LOCATION_STRAND:Strand to which H-Inv cluster mapped on the genome; forward or reverse, for the clusters mapped on human genome at more than one location for an identical condition.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_CLUSTER-ID:H-Inv cluster ID if the cDNA is mapped on human genome at more than one location for not identical but above our standard.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_CHROMOSOME-NUMBER:Chromosomal number of H-Inv cluster mapped if the cDNA is mapped on human genome at more than one location for not identical but above our standard.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_START:Start position on the genome of H-Inv cluster mapped if the cDNA is mapped on human genome at more than one location for not identical but above our standard.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_END:End position of H-Inv cluster mapped if the cDNA is mapped on human genome at more than one location for not identical but above our standard.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_STRAND: Strand to which H-Inv cluster mapped on the genome; forward or reverse, for the clusters mapped on human genome at more than one location for not identical but above our standard.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_UNORDERED-REGION_RANDOM-FLAG:Flag for if the cDNA/cluster is mapped on unordered part(random) of the chromosome, for the clusters mapped on human genome at more than one location for not identical but above our standard.
CDNA_LOCUS_POSSIBLE-DUPLICATED-LOCI_UNORDERED-REGION_NCBI-CONTIG:NCBI Contigs (prefix NT_ or NW_) record if the cDNA/cluster is mapped on unordered part (random) of the chromosome, for the clusters mapped on human genome at more than one location for not identical but above our standard.
CDNA_DB-REFERENCE_KEGG-GENES:Registration of the cDNA in KEGG-GENES; metabolic pathway database.
CDNA_DB-REFERENCE_ENSEMBL:Corresponding Ensembl ID.
CDNA_DB-REFERENCE_ENTREZGENE:Corresponding Entrez Gene ID.
CDNA_DB-REFERENCE_REFSEQ:Corresponding RefSeq ID.
CDNA_REP-H-INVITATIONAL:Flag to show if the transcript is 'representative transcript' of the cluster.
CDNA_SPLICING-ISOFORM_CURATION:Judgment of the splicing isoform.
CDNA_DB-REFERENCE_PUBMED:PubMed ID referenced by the data source protein entry on which the functional definition was based.
JOURNAL_DB-REFERENCE_MEDLINE:Publication journal of H-Invitational database. 
MRNA-INSPECTION_CLONE-NUMBER_ORIGINAL:Clone ID/number from which the original nucleotide sequence was obtained.
MRNA-INSPECTION_CLONE-NUMBER_REVISED:Clone ID/number from which the revised nucleotide sequence was obtained.
MRNA-INSPECTION_CELL-TYPE-ORIGIN_ORIGINAL:Cell type from which the original nucleotide sequence was obtained.
MRNA-INSPECTION_CELL-TYPE-ORIGIN_REVISED:Cell type from which the revised nucleotide sequence was obtained.
MRNA-INSPECTION_TISSUE-TYPE-ORIGIN_ORIGINAL:Tissue type from which the original nucleotide sequence was obtained.
MRNA-INSPECTION_TISSUE-TYPE-ORIGIN_REVISED:Tissue type from which the revised nucleotide sequence was obtained.
MRNA-INSPECTION_DEV-STAGE_ORIGINAL:Developmental stage from which the original nucleotide sequence was obtained, if the sequence was obtained from an organism in a specific developmental stage.
MRNA-INSPECTION_DEV-STAGE_REVISED:Developmental stage from which the revised nucleotide sequence was obtained, if the sequence was obtained from an organism in a specific developmental stage.
MRNA-INSPECTION_DATA-PROVIDER:Institute or organization from which original cDNA sequence was provided.
MRNA-INSPECTION_LENGTH-OF-CDNA_ORIGINAL:Length of original cDNA.
MRNA-INSPECTION_LENGTH-OF-CDNA_REVISED:Length of curated cDNA.
MRNA-INSPECTION_NUMBER-OF-EXON_ORIGINAL:Number of exon(s) of original cDNA.
MRNA-INSPECTION_NUMBER-OF-EXON_REVISED:Number of exon(s) of curated cDNA.
MRNA-INSPECTION_NUMBER-OF-A_ORIGINAL:Number of adenines in original cDNA.
MRNA-INSPECTION_NUMBER-OF-A_REVISED:Number of adenines in curated cDNA.
MRNA-INSPECTION_NUMBER-OF-T_ORIGINAL:Number of thymines in original cDNA.
MRNA-INSPECTION_NUMBER-OF-T_REVISED:Number of thymines in curated cDNA.
MRNA-INSPECTION_NUMBER-OF-G_ORIGINAL:Number of guanines in original cDNA.
MRNA-INSPECTION_NUMBER-OF-G_REVISED:Number of guanines in curated cDNA.
MRNA-INSPECTION_NUMBER-OF-C_ORIGINAL:Number of cytosines in original cDNA.
MRNA-INSPECTION_NUMBER-OF-C_REVISED:Number of cytosines in curated cDNA.
MRNA-INSPECTION_INCOMPLETE-SPLICING_REMAINING-INTRON:Judgement whether remaining intron was revised.
MRNA-INSPECTION_INCOMPLETE-SPLICING_REVISED-INTRON_START:Start position of the revised remaining intron in the cDNA.
MRNA-INSPECTION_INCOMPLETE-SPLICING_REVISED-INTRON_END:End position of the revised remaining intron in the cDNA.
MRNA-INSPECTION_FRAME-SHIFT_START:Start position of revised frameshift error in the cDNA.
MRNA-INSPECTION_FRAME-SHIFT_END:End position of revised frameshift error in the cDNA.
MRNA-INSPECTION_FRAME-SHIFT_BASE:Base code of original cDNA of frameshift error.
MRNA-INSPECTION_FRAME-SHIFT_BASE:Base code of revised frameshift error.
MRNA-INSPECTION_FRAME-SHIFT_INDEL:Type of revised frameshift error, insertion or deletion.
PATHOLOGY_DATE-LAST-UPDATE:Date of the last update of the H-InvDB DiseaseInfo section.
PATHOLOGY_DATE-LAST-MODIFIED:Date of the last modification of the H-InvDB DiseaseInfo section.
PATHOLOGY_DISEASE-RELEASE:The release number of H-InvDB DiseaseInfo section.
PATHOLOGY_KNOWN-DISEASE-GENE_CDNA-OMIM_DB-REFERENCE_OMIM-DISEASE:Online Mendelian Inheritance in Man (OMIM) ID of known-disease assigned to this cluster.
PATHOLOGY_KNOWN-DISEASE-GENE_CDNA-OMIM_DISEASE_DISEASE-NAME:OMIM title (Disease name) registered for OMIM ID of known-disease assigned to this cluster.
PATHOLOGY_KNOWN-DISEASE-GENE_CDNA-OMIM_DISEASE-RELATION:OMIM title (Disease name) of known-disease assigned to the cluster to which this HIT is mapped, without OMIM ID.
PATHOLOGY_KNOWN-DISEASE-GENE_CDNA-OMIM_DB-REFERENCE_OMIM:Online Mendelian Inheritance in Man (OMIM) ID assigned to the cluster to which this HIT is mapped, by relation of protein.
PATHOLOGY_KNOWN-DISEASE-GENE_CDNA-OMIM_MORBID-MAP_OMIM-TITLE:OMIM title assigned to the cluster to which this HIT is mapped, by relation of protein without OMIM ID.
PATHOLOGY_ORPHAN-DISEASE_DB-REFERENCE_LOCUS-OMIM:Online Mendelian Inheritance in Man (OMIM) ID of co-localized disease assigned to the chromosomal location.
EVOLUTIONARY-FEATURE_DATE-LAST-UPDATE:Date of the last update of the evolutionary section of this H-Inv cDNA entry.
EVOLUTIONARY-FEATURE_DATE-LAST-MODIFIED:Date of the last modification of the evolutionary section of this H-Inv cDNA entry.
EVOLUTIONARY-FEATURE_EVOLUTION-RELEASE:The release number of H-InvDB Evolution section.
EVOLUTIONARY-FEATURE_CHIMPANZEE_ACCESSION-NO:DNA databank accession number of the chimpanzee ortholog of the human cDNA.
EVOLUTIONARY-FEATURE_CHIMPANZEE_SPECIES:Pan troglodytes, chimpanzee
EVOLUTIONARY-FEATURE_CHIMPANZEE_ORTHOLOGY:A flag indicating the curation status of this orthologous candidate.
EVOLUTIONARY-FEATURE_CRMONKEY_ACCESSION-NO:DNA databank accession number of the crab-eating macaque ortholog of the human cDNA.
EVOLUTIONARY-FEATURE_CRMONKEY_SPECIES:Macaca fascicularis, crab-eating macaque
EVOLUTIONARY-FEATURE_CRMONKEY_ORTHOLOGY:A flag indicating the curation status of this orthologous candidate.
EVOLUTIONARY-FEATURE_RHMONKEY_ACCESSION-NO:DNA databank accession number of the rhesus monkey ortholog of the human cDNA.
EVOLUTIONARY-FEATURE_RHMONKEY_SPECIES:Macaca mulatta, rhesus monkey
EVOLUTIONARY-FEATURE_RHMONKEY_ORTHOLOGY:A flag indicating the curation status of this orthologous candidate.
EVOLUTIONARY-FEATURE_MOUSE_ACCESSION-NO:DNA databank accession number of the mouse ortholog of the human cDNA.
EVOLUTIONARY-FEATURE_MOUSE_SPECIES:Mus musculus, house mouse
EVOLUTIONARY-FEATURE_MOUSE_ORTHOLOGY:A flag indicating the curation status of this orthologous candidate.
EVOLUTIONARY-FEATURE_RAT_ACCESSION-NO:DNA databank accession number of the rat ortholog of the human cDNA.
EVOLUTIONARY-FEATURE_RAT_SPECIES:Rattus norvegicus, Norway rat
EVOLUTIONARY-FEATURE_RAT_ORTHOLOGY:A flag indicating the curation status of this orthologous candidate.
CDNA-INFO_SNP_START:Start position of SNP; single nucleotide polymorphism [dbSNP format].
CDNA-INFO_SNP_END:End position of SNP; single nucleotide polymorphism [dbSNP format].
CDNA-INFO_SNP_DB-REFERENCE_DBSNP:dbSNP ID of the determined SNP; single nucleotide polymorphism [dbSNP format].
CDNA-INFO_SNP_BASE-DBSNP:Codon code of SNP; single nucleotide polymorphism [dbSNP format].
CDNA-INFO_SNP_BASE-CDNA:Codon code of SNP; single nucleotide polymorphism; qualifier / replace [DNA databank format].
CDNA-INFO_SNP_LOCATION: Location of SNP; 5'-UTR, ORF or 3'-UTR.
CDNA-INFO_SNP_TRANSLATION: Effect of SNP in translation; synonymous, nonsynonymous or termination.
CDNA-INFO_SNP_STRAND:Orientation of SNP; single nucleotide polymorphism.
CDNA-INFO_ADAPTER_START:Start position of adapter sequence of the cloning vector.
CDNA-INFO_ADAPTER_END:End position of adapter sequence of the cloning vector.
CDNA-INFO_ADAPTER_TYPE:Sequence of adapter sequence of the cloning vector.
CDNA-INFO_POLYA-SIGNAL_START:Start position of polyA signal; recognition region necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA
CDNA-INFO_POLYA-SIGNAL_END:End position of polyA signal; recognition region necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA
CDNA-INFO_POLYA-SIGNAL_BASE:Nucleotide sequence of polyA signal; recognition region necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA
CDNA-INFO_POLYA-SIGNAL_STRAND:Strand of polyA signal; recognition region necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA
CDNA-INFO_POLYA_SITE:Site on an RNA transcript to which adenine residues will be added by post-transcriptional polyadenylation.
CDNA-INFO_POLYA_STRAND:Strand of the site on an RNA transcript to which adenine residues will be added by post-transcriptional polyadenylation.
CDNA-INFO_EXON-CDNA_START:Start position of exon; region of cDNA that codes for a portion of spliced mRNA that may contain 5' UTR, all CDSs and 3' UTR.
CDNA-INFO_EXON-CDNA_END:End position of exon; region of cDNA that codes for a portion of spliced mRNA that may contain 5' UTR, all CDSs and 3' UTR.
CDNA-INFO_EXON-GENOME_START:Start position of exon; region of genome that codes for a portion of spliced mRNA that may contain 5' UTR, all CDSs and 3' UTR.
CDNA-INFO_EXON-GENOME_END:End position of exon; region of genome that codes for a portion of spliced mRNA that may contain 5' UTR, all CDSs and 3' UTR.
CDNA-INFO_REPEAT_START:Start position of the region of cDNA containing repeating units.
CDNA-INFO_REPEAT_END:End position of the region of cDNA containing repeating units.
CDNA-INFO_REPEAT_TYPE:Type of the region of cDNA containing repeating units
CDNA-INFO_REPEAT_STRAND:Strand of the region of cDNA containing repeating units
CDNA-INFO_MICROSATELLITE_START:Start position of microsatellite; many tandem repeats (identical or related) of a short basic repeating unit; many have a base composition or other property different from the genome average that allows them to be separated from the bulk (main band) genomic DNA
CDNA-INFO_MICROSATELLITE_END:End position of microsatellite; many tandem repeats (identical or related) of a short basic repeating unit; many have a base composition or other property different from the genome average that allows them to be separated from the bulk (main band) genomic DNA
CDNA-INFO_MICROSATELLITE_UNIT:Unit of microsatellite; many tandem repeats (identical or related) of a short basic repeating unit; many have a base composition or other property different from the genome average that allows them to be separated from the bulk (main band) genomic DNA
CDNA-INFO_MICROSATELLITE_STRAND:Strand of microsatellite; many tandem repeats (identical or related) of a short basic repeating unit; many have a base composition or other property different from the genome average that allows them to be separated from the bulk (main band) genomic DNA
SEQUENCE_NUCLEOTIDE_ORIGINAL:Nucleotide sequence of original cDNA
SEQUENCE_NUCLEOTIDE_REVISED:Nucleotide sequence of revised cDNA
PREDICTED-ORF_CODING-POTENTIAL:Protein coding potential of cDNA, classified into three categories: "Protein-coding transcript", "Non-protein-coding transcript" and "Questionable transcript". The classification is based on both human curation and computational prediction of ORF.
PREDICTED-ORF_NON-PROTEIN-CODING-TRANSCRIPTS:Classification of non-protein-coding transcripts; ncRNA, uncharacterized transcript, unclassifiable transcript or hold transcript. The classification was based on human curation for all of the non-protein-coding transcripts.
PREDICTED-ORF_CDS_SOURCE_START:Start position on curated cDNA.
PREDICTED-ORF_CDS_SOURCE_END:End position on curated cDNA.
PREDICTED-ORF_CDS_CDS_START:Start position on curated cDNA; the sequence of nucleotides that corresponds to the sequence of amino acids in a protein.
PREDICTED-ORF_CDS_CDS_END:End position on curated cDNA; the sequence of nucleotides that corresponds to the sequence of amino acids in a protein (location includes stop codon).
PREDICTED-ORF_CDS_LENGTH-OF-CDS:Amino acid length of the transcript.
PREDICTED-ORF_INCOMPLETE-SPLICING-REVISED:Flag indicating that the remaining intronic sequences of the cDNA were revised.
PREDICTED-ORF_FRAMESHIFT-ERROR-REVISED:Flag indicating that the predicted frameshift error of the cDNA was revised.
PREDICTED-ORF_ORIENTATION:The orientation of the translation of cDNA.
PREDICTED-ORF_CAI:Codon Adaptation Index of predicted ORFs determined by EMBOSS (http://biobase.dk/embossdocs/cai.html).
FUNCTION_DEFINITION_DATA-SOURCE_DB-REFERENCE_PROTEIN-MOTIF-ID:Protein ID or InterPro ID of known non-hypothetical protein which this H-Inv transcript was identical to or similar to, or InterPro domain.
FUNCTION_DEFINITION_DATA-SOURCE_IDENTITY:% identity to known non-hypothetical protein which this H-Inv transcript was identical to or similar to.
FUNCTION_DEFINITION_DATA-SOURCE_COVERAGE:% coverage to known non-hypothetical protein which this H-Inv transcript was identical to or similar to.
FUNCTION_DEFINITION_DATA-SOURCE_HOMOLOGOUS_SPECIES:Species of known non-hypothetical protein which this H-Inv transcript was identical to or similar to.
FUNCTION_DEFINITION_DATA-SOURCE_CURATION-STATUS:A flag indicating the curation status of this H-Inv transcript, whether it is "human-curated" or "auto-annotated".
FUNCTION_DEFINITION_DATA-SOURCE_DEFINITION:Functional definition of this H-Inv transcript.
FUNCTION_DEFINITION_DATA-SOURCE_SIMILARITY-CATEGORY:Similarity category of protein-coding transcript according to its sequence similarity to known non-hypothetical protein. See details in "Annotation policy" section.
FUNCTION_DEFINITION_GENENAME-SYMBOL_DDBJ-EMBL-GenBank:Gene name of this H-Inv transcript assigned by DNA databank.
FUNCTION_DEFINITION_GENENAME-SYMBOL_PROTEIN_SWISS-PROT:Gene name of this H-Inv transcript assigned by Swiss-Prot/TrEMBL or RefSeq which the transcript is identical or similar to.
FUNCTION_DEFINITION_GENENAME-SYMBOL_H-INV:Gene name of this H-Inv transcript originally assigned by H-Invitational database.
FUNCTION_DEFINITION_HUGO-APPROVED-GENE-SYMBOL_GENEW:HUGO approved gene name of this H-Inv transcript related to SWISS-PROT Protein ID.
FUNCTION_DEFINITION_HUGO-APPROVED-GENE-NEW-NAME:HUGO approved gene name of this H-Inv transcript assigned by H-InvDB
FUNCTION_DEFINITION_EC-NO_DB-REFERENCE_DDBJ-EMBL-GenBank:Enzyme Commission number for enzyme product of this H-Inv transcript recorded at DNA databank
FUNCTION_DEFINITION_EC-NO_DB-REFERENCE_SWISS-PROT:Enzyme Commission number for enzyme product of this H-Inv transcript assigned by data source protein ID of functional definition.
FUNCTION_DEFINITION_EC-NO_DB-REFERENCE_INTERPRO-GO:Enzyme Commission number for enzyme product of this H-Inv transcript assigned from GO ID
FUNCTION_DEFINITION_EC-NO_DB-REFERENCE_H-INV:Unique Enzyme Commission number for enzyme product of this H-Inv transcript determined by H-InvDB
FUNCTION_DEFINITION_NOTES_POSSIBLE-GENOMIC-FRAGMENT:Note of human curation for possible genomic fragment
FUNCTION_SUPPLEMENTARY_DATA-SOURCE_DB-REFERENCE_PROTEIN-MOTIF-ID:Data source ID of human curation for supplementary information; "conserved hypothetical protein", "second-meaningful hit", "multi-functional" and "possible motif"
FUNCTION_SUPPLEMENTARY_DATA-SOURCE_IDENTITY:% identity to data source ID of human curation for supplementary information; "conserved hypothetical protein", "second-meaningful hit", "multi-functional" and "possible motif"
FUNCTION_SUPPLEMENTARY_DATA-SOURCE_COVERAGE:% coverage to data source ID of human curation for supplementary information; "conserved hypothetical protein", "second-meaningful hit", "multi-functional" and "possible motif"
FUNCTION_SUPPLEMENTARY_DATA-SOURCE_HOMOLOGOUS_SPECIES:Species of data source ID of human curation for supplementary information; "conserved hypothetical protein", "second-meaningful hit", "multi-functional" and "possible motif"
FUNCTION_SUPPLEMENTARY_DATA-SOURCE_DEFINITION:Function for supplementary information; "conserved hypothetical protein", "second-meaningful hit", "multi-functional" and "possible motif"
FUNCTION_SUPPLEMENTARY_NOTES_CONSERVED-HYPOTHETICAL:Note of human curation for supplementary information; "conserved hypothetical protein".
FUNCTION_GENE-ONTOLOGY_DOMAIN_DB-REFERENCE_INTERPRO:InterPro ID of predicted InterPro motif in the CDS.
FUNCTION_GENE-ONTOLOGY_DOMAIN_NAME:Name of predicted InterPro motif in the CDS.
FUNCTION_GENE-ONTOLOGY_DOMAIN_TYPE:Type of predicted InterPro motif in the CDS; family, domain, repeat or PTM (post-translational modification).
FUNCTION_GENE-ONTOLOGY_DOMAIN_MOTIF_START:Start position of predicted InterPro motif in the CDS.
FUNCTION_GENE-ONTOLOGY_DOMAIN_MOTIF_END:End position of predicted InterPro motif in the CDS.
FUNCTION_GENE-ONTOLOGY_DOMAIN_MOTIF_GO_ORGANIZING-PLINCIPLE:Gene ontology (GO) code.
FUNCTION_GENE-ONTOLOGY_DOMAIN_MOTIF_GO_DB-REFERENCE_GO:Gene ontology (GO) ID assigned from the InterProScan result by an option of InterProScan.
FUNCTION_GENE-ONTOLOGY_DOMAIN_MOTIF_GO_TERM:Gene ontology (GO) term assigned from the InterProScan result by an option of InterProScan.
FUNCTION_CELLULAR-LOCATION_DATE-LAST-UPDATE:Date of the last update of the subcellular localization section of this H-Inv cDNA entry.
FUNCTION_CELLULAR-LOCATION_DATE-LAST-MODIFIED:Date of the last modification of the subcellular localization section of this H-Inv cDNA entry.
FUNCTION_CELLULAR-LOCATION_SUBCELLULAR-RELEASE:The release number of H-InvDB Subcellular localization prediction section.
FUNCTION_CELLULAR-LOCATION_LOCATION_WPSORT:The WoLF PSORT program predicts the subcellular localization of a protein to one (or more) of twelve cellular compartments: (1) chloroplast, chlo; (2) cytosol, cyto; (3) cytoskeleton, cysk; (4) endoplasmic reticulum, E.R; (5) extracellular, extr; (6) golgi apparatus, golg; (7) lysosome, lyso; (8) mitochondria, mito; (9) nuclear, nucl; (10) peroxisome, pero; (11) plasma membrane, plas; (12) vacuolar membrane, vacu.
FUNCTION_CELLULAR-LOCATION_TARGET-P:The TargetP program predicts a targeting signal to the mitochondria (mit) or other compartments (other). It also predicts the presence of a signal peptide (SP). When no signal peptide is predicted, the query sequence is assigned an asterisk (*).
FUNCTION_CELLULAR-LOCATION_SOSUI:The SOSUI program predicts the presence of transmembrane (TM) helices in proteins. Proteins are classified as: (1) 'transmembrane protein' if one or more TM helices are predicted; (2) 'soluble protein' if no TM helix is predicted; (3) 'unidentified' if the protein is encoded by a very short ORF.
FUNCTION_CELLULAR-LOCATION_TMHMM:The TMHMM program predicts the presence of transmembrane (TM) helices in proteins. Proteins are classified as: (1) 'transmembrane protein' if one or more TM helices are predicted; (2) 'soluble protein' if no TM helix is predicted; (3) 'unidentified' if the protein is encoded by a very short ORF.
PROTEIN-STRUCTURE_DATE-LAST-UPDATE:Date of the last update of the protein 3D-structure section of this H-Inv cDNA entry.
PROTEIN-STRUCTURE_DATE-LAST-MODIFIED:Date of the last modification of the protein 3D-structure section of this H-Inv cDNA entry.
PROTEIN-STRUCTURE_3D-STRUCTURE-RELEASE:The release number of H-InvDB 3D-structure prediction section.
PROTEIN-STRUCTURE_SECONDARY-TERTIARY-STRUCTURE_DB-REFERENCE_GTOP:GTOP ID
PROTEIN-STRUCTURE_SECONDARY-TERTIARY-STRUCTURE_STRUCTURE_START:Start position of the amino acid which shows homology to the known protein structure
PROTEIN-STRUCTURE_SECONDARY-TERTIARY-STRUCTURE_STRUCTURE_END:End position of the amino acid which shows homology to the known protein structure
PROTEIN-STRUCTURE_SECONDARY-TERTIARY-STRUCTURE_STRUCTURE_DB-REFERENCE_PDB:PDB code of the amino acid which shows homology to the known protein structure
PROTEIN-STRUCTURE_SECONDARY-TERTIARY-STRUCTURE_STRUCTURE_EVALUE:E-value of the amino acid which shows homology to the known protein structure
PROTEIN-STRUCTURE_SECONDARY-TERTIARY-STRUCTURE_STRUCTURE_IDENTITY:Identity of the amino acid which shows homology to the known protein structure
PROTEIN-STRUCTURE_SECONDARY-TERTIARY-STRUCTURE_STRUCTURE_COVERAGE:Coverage of the amino acid which shows homology to the known protein structure
PROTEIN-STRUCTURE_SECONDARY-TERTIARY-STRUCTURE_STRUCTURE_DOMAIN_DB-REFERENCE_SCOP:SCOP ID (ID for the domain which shows homology to the known protein structure)
PROTEIN-STRUCTURE_SECONDARY-TERTIARY-STRUCTURE_STRUCTURE_DOMAIN_ANOMALOUS-STRUCTURE:Abnormality of the result by computer analysis
SEQUENCE_TRANSLATION:Translation of the nucleotide sequence of this H-Inv transcript into an amino acid sequence.
//:The end of entry.
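
The terms above can be read with a small line-oriented parser. The following is a minimal Python sketch, not an official H-InvDB tool. It assumes that each line has the form [term_name]:[term_value], that a line beginning with "//" closes an entry, and that a term may occur more than once within an entry (so values are collected into lists). The function name parse_entries and the CDS positions in the sample are hypothetical; only the accession number AK000008 comes from the example in section 5.1.

```python
from collections import defaultdict


def parse_entries(lines):
    """Yield one dict per entry, mapping term name -> list of string values."""
    entry = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("//"):
            # End-of-entry marker: emit what has been collected so far.
            if entry:
                yield dict(entry)
            entry = defaultdict(list)
            continue
        # Split on the first colon only, so values may themselves contain
        # colons (for example URLs).
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank or malformed lines that have no colon
        entry[name.strip()].append(value.strip())
    if entry:
        yield dict(entry)  # tolerate a file that lacks the final "//"


# Illustrative input; the two CDS positions are made-up values.
sample = [
    "CDNA_ACCESSION-NO: AK000008",
    "PREDICTED-ORF_CDS_CDS_START: 101",
    "PREDICTED-ORF_CDS_CDS_END: 400",
    "//",
]
entries = list(parse_entries(sample))
print(entries[0]["CDNA_ACCESSION-NO"])
```

Values are kept as strings, so positions, percentages and dates need to be converted by the caller according to the term descriptions above.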



Revised: March 30, 2007