*--------------------------------------------------------------------------------------------------*
Release information of H-InvDB_8.0
http://www.h-invitational.jp

Dataset fixed on July 30, 2011.
Released on April 20, 2012.
*--------------------------------------------------------------------------------------------------*

---------------------------
H-InvDB statistics
---------------------------
1. Number of H-Invitational transcripts (HIT)
   all HIT: 249,012
   * protein coding transcripts: 227,437
   * non-protein-coding transcripts: 20,465
   * pseudogene candidates: 1,110

2. Number of H-Invitational clusters (HIX)
   all HIX: 45,177
   * protein coding: 36,096
   * non-protein-coding: 8,329
   * pseudogene candidates: 752

3. Number of H-Invitational proteins (HIP)
   all HIP: 152,151

---------------------------
Human nucleotide datasets
---------------------------
1. Human full-length cDNA dataset
   The dataset contains sequences produced by six institutes.
   All the sequences are already in DDBJ/EMBL/GenBank.

2. Human mRNA dataset
   Human mRNA sequences registered in DDBJ/EMBL/GenBank, other than full-length cDNA,
   were extracted from DDBJ release 86 obtained on July 30, 2011.

3. Human genome dataset
   Repeat-masked human genome assembly NCBI build 37.1 was obtained from UCSC.
   (GRCh37: UCSC hg19, Feb. 2009: human genome NCBI b37.1)

---------------------------
Databases
---------------------------
1. RefSeq mRNA
   RefSeq curated mRNAs were obtained from NCBI on July 30, 2011.
   (RefSeq release 47)

2. Ensembl transcripts
   Ensembl transcripts were obtained from Ensembl on July 30, 2011.
   (Ensembl release 63)

3. RefSeq protein
   RefSeq proteins were obtained from NCBI on July 30, 2011.
   (RefSeq release 47)

4. UniProt (SWISS-PROT/TrEMBL)
   UniProt (SWISS-PROT/TrEMBL) entries were obtained from EBI on July 30, 2011.
   (Release 2011_07)

5. HUGO approved gene symbol
   http://www.gene.ucl.ac.uk/nomenclature/
   Human gene name data were fixed on July 30, 2011.

6. Entrez Gene database
   http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene
   Relations of H-InvDB genes to Entrez Genes were fixed on July 30, 2011.

7. dbSNP
   Relations of H-InvDB genes to dbSNP build 132 were fixed on July 30, 2011.

---------------------------
Contents
---------------------------
README.txt                 :this file
acc2hinv_id.txt.gz         :Summary of the cDNA data providers and the INSD (DDBJ/EMBL/GenBank)
                            Accession Numbers versus the H-Invitational Identifiers
new_del_update_hinvid.txt  :List of new, deleted and updated H-Invitational IDs
jbirc_ff/                  :H-InvDB annotated datasets in jbirc-format
                            (refer to http://www.h-invitational.jp/hinv/help/help_index.html
                             or http://www.h-invitational.jp/hinv/dataset/download.cgi
                             for more information)
jbirc_xml/                 :H-InvDB annotated datasets in jbirc-xml-format
                            (refer to http://www.h-invitational.jp/hinv/help/help_index.html
                             or http://www.h-invitational.jp/hinv/dataset/download.cgi
                             for more information)
analysis/                  :H-InvDB datasets annotated by computational analysis
sequence/                  :H-InvDB annotated sequence datasets for transcript, protein
                            and genome sequences
*--------------------------------------------------------------------------------------------------*