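Harbin Medical University Undergraduate Thesis
Chinese Abstract
Large amounts of genomic data, microarray data in particular, are available through a variety of online resources such as the Gene Expression Omnibus (GEO). However, existing gene expression databases are mutually incompatible in their database interfaces, expression data storage formats, and clinical metadata annotations, and the annotations of data obtained from different databases can also be inconsistent. These shortcomings make it very difficult to search for prognostic genes of a disease.
Among primary brain tumors, glioma has the worst prognosis. Its prognosis depends on biological characteristics, the site of growth, the surgical approach, and other treatments. Because gliomas grow infiltratively, they cause considerable damage to nervous tissue and are difficult to resect completely, and the great majority of gliomas still have a high probability of recurrence after surgery, radiotherapy, and chemotherapy. Gliomas are divided into four grades: I, II, III, and IV. Low-grade gliomas are well differentiated, and patients tend to have a relatively good prognosis; high-grade gliomas have a poorer prognosis.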
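On this basis, a meta-analysis was performed on integrated glioma microarray expression data, all of which were processed with a uniform normalization procedure and mapped to HGNC gene symbols. The meta-analysis was then carried out in R, and finally a Cox proportional hazards regression model was used to search for prognostic biomarkers of the disease.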
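For reference, the Cox proportional hazards model is usually written in the form below. The abstract does not specify the covariates, so taking $x$ to be the expression level of a single gene is an assumption made here for illustration only.

```latex
h(t \mid x) = h_0(t)\,\exp(\beta x), \qquad \mathrm{HR} = \exp(\beta)
```

Here $h_0(t)$ is the unspecified baseline hazard. Under this reading, a gene with $\beta > 0$ (hazard ratio above 1) is associated with higher risk as its expression increases, and a gene with $\beta < 0$ with lower risk.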
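An important application of this study is to use multiple independent studies to test glioma prognostic genes that were previously proposed as hypotheses. Meta-analysis summarizes the consistency of results from multiple studies on the same topic and provides a systematic evaluation and summary of them; it can also increase statistical power and the precision of effect size estimates.
Keywords: survival analysis; biomarker; meta-analysis; prognosis; glioma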
Meta-Analysis of Gene Chip Data: Prognostic Analysis of Glioma
Meta-analysis and survival analysis of the gene expression of glioma
Abstract
A wealth of genomic data, in particular microarray data, is publicly available through diverse online resources, including major gene expression databases such as ArrayExpress and the Gene Expression Omnibus (GEO). However, inconsistent formatting of database interfaces, expression data storage, and clinical metadata annotations presents formidable obstacles to making efficient use of these resources. These databases provide machine-annotated rather than manually annotated data, which reduces the consistency of annotation across studies. These defects can cause great difficulty when searching for disease biomarkers.
Glioma is the primary brain tumor with the worst prognosis. Its prognosis is related to biological characteristics, the site of growth, the surgical approach, and many other treatment measures. Because gliomas grow infiltratively, they damage the nervous system and are difficult to resect completely, and the vast majority of gliomas are likely to recur after surgery, radiotherapy, and chemotherapy. Gliomas are divided into four grades: I, II, III, and IV. Low-grade gliomas are well differentiated, and patients often have a relatively good prognosis; high-grade gliomas usually have a poor prognosis.
On this basis, I used seven glioma microarray expression data sets for a meta-analysis. The gene expression data were collected from public databases and author websites, processed in a consistent manner, and mapped uniformly to official HUGO Gene Nomenclature Committee (HGNC) gene symbols. The meta-analysis was then carried out in R. Finally, a Cox proportional hazards regression model was used to identify prognostic biomarkers of the disease.
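One common way to combine per-study Cox results is inverse-variance (fixed-effect) pooling of log hazard ratios. The abstract does not say which pooling method was used, and the analysis itself was run in R, so the Python sketch below is only an illustration under that assumption. The function name `pool_fixed_effect` and all input numbers are hypothetical.

```python
import math


def pool_fixed_effect(log_hrs, std_errs):
    """Inverse-variance (fixed-effect) pooling of per-study log hazard ratios.

    Assumption: each study contributes one Cox regression coefficient
    (log hazard ratio) and its standard error for the same gene.
    """
    if not log_hrs or len(log_hrs) != len(std_errs):
        raise ValueError("need one standard error per log hazard ratio")
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in std_errs]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, z, p_value


# Hypothetical per-study estimates for a single gene (illustrative only,
# not taken from the thesis).
log_hrs = [0.40, 0.25, 0.55]
std_errs = [0.20, 0.25, 0.30]

pooled, pooled_se, z, p_value = pool_fixed_effect(log_hrs, std_errs)
print(f"pooled HR = {math.exp(pooled):.2f}, "
      f"SE = {pooled_se:.3f}, p = {p_value:.4f}")
```

Under this scheme the pooled standard error is always smaller than that of any single contributing study, which is one concrete sense in which combining independent data sets improves the precision of the effect estimate.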
An important application of my research is to use multiple independent studies to test genes previously hypothesized to be prognostic biomarkers of glioma. Meta-analysis summarizes the consistency of results from multiple studies on the same topic and provides a systematic evaluation and summary of them; it also improves statistical power and the precision of effect size estimates.
Key words: survival analysis; biomarker; meta-analysis; prognosis; glioma