High-throughput single nucleotide polymorphism (SNP) detection technology and existing knowledge provide strong support for mining disease-related haplotypes and genes. In this study, we first apply four haplotype block identification methods (Confidence Intervals, Four Gamete Test, Solid Spine of LD, and a fusing method of haplotype blocks) to high-throughput SNP genotype data to identify blocks, then use cluster analysis to verify the effectiveness of the four methods, and select alcoholism-related SNP haplotypes through risk analysis. Second, we establish a mapping from haplotypes to alcoholism-related genes. Third, we query the NCBI SNP and gene databases to locate the blocks and identify candidate genes. Finally, we annotate gene function using the KEGG, Biocarta, and GO databases. We find 159 haplotype blocks on chromosomes 1 to 22 that are most likely related to alcoholism; they include 227 haplotypes, of which 102 SNP haplotypes may increase the risk of alcoholism. We obtain 121 alcoholism-related genes and verify their reliability by biological functional annotation. In summary, by combining our novel strategies for mining alcoholism-related haplotypes and genes with the existing knowledge framework, we can both handle the SNP data easily and locate disease-related genes precisely.
ZHANG RuiJie1, LI Xia1,2, JIANG YongShuai1, LIU GuiYou1, LI ChuanXing1, ZHANG Fan1, XIAO Yun1 & GONG BinSheng1 1 Department of Bioinformatics, Harbin Medical University, Harbin 150086, China
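As an illustration of one of the four block identification methods named in the abstract above, the four-gamete test can be sketched as follows. This is a minimal sketch, not the authors' implementation: it assumes phased haplotypes coded 0/1 at biallelic SNPs, ignores the gamete-frequency thresholds that practical tools apply, and uses a simple greedy left-to-right partition. The function names and the toy data are hypothetical.

```python
def has_four_gametes(haplotypes, i, j):
    """Return True if all four allele combinations occur at SNP columns i and j.

    Seeing all four gametes (00, 01, 10, 11) is taken as evidence of
    historical recombination between the two SNPs.
    """
    observed = {(h[i], h[j]) for h in haplotypes}
    return len(observed) == 4


def four_gamete_blocks(haplotypes):
    """Greedily split SNP columns into runs that contain no four-gamete pair.

    Returns a list of (first_snp_index, last_snp_index) tuples, inclusive.
    """
    n_snps = len(haplotypes[0])
    blocks, start = [], 0
    for end in range(1, n_snps):
        # The block may grow to include SNP `end` only if that SNP is
        # compatible with every SNP already in the current block.
        if any(has_four_gametes(haplotypes, k, end) for k in range(start, end)):
            blocks.append((start, end - 1))
            start = end
    blocks.append((start, n_snps - 1))
    return blocks


# Hypothetical toy data: rows are phased haplotypes, columns are SNPs.
toy = [
    (0, 0, 0, 0),
    (0, 0, 1, 1),
    (1, 1, 0, 0),
    (1, 1, 1, 1),
]
print(four_gamete_blocks(toy))  # [(0, 1), (2, 3)]
```

In the toy data, SNPs 0 and 2 show all four gametes, so the first block ends at SNP 1, while SNPs 2 and 3 show only two gametes and stay together.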
GESTs (gene expression similarity and taxonomy similarity), a gene function prediction approach that we proposed previously, is based on gene expression similarity and the concept similarity of functional classes defined in Gene Ontology (GO). In this paper, we extend this method to protein-protein interaction data by introducing several methods to filter the neighbors of a protein of unknown function in protein interaction networks. Unlike conventional methods, the proposed approach automatically selects the most appropriate functional classes, as specific as possible, during the learning process, and calls on genes annotated to nearby classes to support predictions to small, specific classes in GO. Using yeast protein-protein interaction information from MIPS and a dataset of gene expression profiles, we assess the performance of our approach in predicting protein functions in the “biological process” ontology with three measures designed specifically for functional classes organized in GO. The results show that our method is powerful for predicting gene functions broadly with very specific functional terms. Based on the GO database published in December 2004, we predict functions for some proteins whose functions were unknown at that time, and some of the predictions have been confirmed by the new SGD annotation data published in April 2006.
GAO Lei1, LI Xia1,2, GUO Zheng1,2, ZHU MingZhu1, LI YanHui1 & RAO ShaoQi1,3 1 Department of Bioinformatics, Harbin Medical University, Harbin 150086, China
[Objective] To investigate whether single nucleotide polymorphisms (SNPs) in the Mn-superoxide dismutase gene...
LI Xu-dong, LIU Yi-min, GUO Xiao, LIU Bin, LIN Ai-hua, DING Yuan-lin, RAO Shao-qi 1. Guangdong Prevention and Treatment Center for Occupational Diseases, Guangzhou, China
Proteins rarely function in isolation inside and outside cells, but operate as part of a highly interconnected cellular network called the interaction network. Therefore, analyzing the properties of drug-target proteins in the biological network is especially helpful for understanding the mechanism of drug action in terms of informatics. At present, no detailed characterization of the topological features of drug-target proteins in the human protein-protein interaction (PPI) network is available. In this work, by mapping the drug targets in DrugBank onto the interaction network of human proteins, we analyzed five topological indices of drug targets and compared them with those of the whole protein interactome set and the non-drug-target set. The experimental results showed that drug-target proteins have higher connectivity and quicker communication with each other in the PPI network. Based on these features, all proteins in the interaction network were ranked. The results showed that, of the top 100 proteins, 48 are covered by DrugBank; of the remaining 52 proteins, 9 are drug-target proteins covered by the TTD, Matador and other databases, while others have been demonstrated to be drug-target proteins in the literature.
ZHU MingZhu1, GAO Lei1, LI Xia1,2 & LIU ZhiCheng1 1 School of Biomedical Engineering, Capital Medical University, Beijing 100069, China
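The comparison of topological indices between drug-target and non-drug-target proteins described in the abstract above can be sketched as follows. The abstract does not name its five indices, so this sketch assumes two illustrative ones: mean degree (as a proxy for connectivity) and mean pairwise shortest-path distance (as a proxy for communication speed). The toy network, the target set, and all function names are hypothetical.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from a list of (a, b) interactions."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def mean_degree(graph, nodes):
    """Average number of interaction partners over the given proteins."""
    return sum(len(graph[n]) for n in nodes) / len(nodes)


def bfs_distances(graph, source):
    """Shortest-path distance (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def mean_pairwise_distance(graph, nodes):
    """Average shortest-path distance over all connected pairs within nodes."""
    nodes = sorted(nodes)
    total, pairs = 0, 0
    for i, a in enumerate(nodes):
        dist = bfs_distances(graph, a)
        for b in nodes[i + 1:]:
            if b in dist:
                total += dist[b]
                pairs += 1
    return total / pairs if pairs else float("nan")


# Hypothetical toy interaction network; A and B play the role of drug targets.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E"), ("E", "F")]
graph = build_graph(edges)
targets = {"A", "B"}
non_targets = set(graph) - targets

print(mean_degree(graph, targets), mean_degree(graph, non_targets))
print(mean_pairwise_distance(graph, targets),
      mean_pairwise_distance(graph, non_targets))
```

In this toy network the assumed targets have a higher mean degree and a shorter mean distance to each other than the non-targets, which mirrors the direction of the result reported in the abstract but says nothing about its magnitude.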