Survival Analysis Approaches for Prostate Cancer

By
Eman Alhasawi

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science (MSc) in Computational Sciences

The Faculty of Graduate Studies
Laurentian University
Sudbury, Ontario, Canada

© Eman Alhasawi, 2015

THESIS DEFENCE COMMITTEE/COMITÉ DE SOUTENANCE DE THÈSE
Laurentian University/Université Laurentienne
Faculty of Graduate Studies/Faculté des études supérieures

Title of Thesis/Titre de la thèse: Survival Analysis Approaches for Prostate Cancer
Name of Candidate/Nom du candidat: Alhasawi, Eman
Degree/Diplôme: Master of Science
Department/Program (Département/Programme): Computational Sciences
Date of Defence/Date de la soutenance: April 15, 2015

APPROVED/APPROUVÉ

Thesis Examiners/Examinateurs de thèse:
Dr. Kalpdrum Passi (Supervisor/Directeur de thèse)
Dr. Mazen Saleh (Committee member/Membre du comité)
Dr. Hafida Boudjellaba (Committee member/Membre du comité)
Dr. Chakresh Jain (External Examiner/Examinateur externe)

Approved for the Faculty of Graduate Studies/Approuvé pour la Faculté des études supérieures
Dr. David Lesbarrères/M. David Lesbarrères
Acting Dean, Faculty of Graduate Studies/Doyen intérimaire, Faculté des études supérieures

ACCESSIBILITY CLAUSE AND PERMISSION TO USE

I, Eman Alhasawi, hereby grant to Laurentian University and/or its agents the non-exclusive license to archive and make accessible my thesis, dissertation, or project report in whole or in part in all forms of media, now or for the duration of my copyright ownership. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also reserve the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report.
I further agree that permission for copying of this thesis in any manner, in whole or in part, for scholarly purposes may be granted by the professor or professors who supervised my thesis work or, in their absence, by the Head of the Department in which my thesis work was done. It is understood that any copying or publication or use of this thesis or parts thereof for financial gain shall not be allowed without my written permission. It is also understood that this copy is being made available in this form by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.

Abstract

Survival time has become an essential outcome in clinical trials, which began to emerge in the latter half of the 20th century. The present study applies survival analysis to patients with prostate cancer. The data were obtained from Memorial Sloan Kettering, where each sample was collected from a patient treated by radical prostatectomy. The Kaplan-Meier method was used to estimate the survival function and the median survival time for the primary and metastatic tumor groups. The results showed that patients with metastatic tumors had poorer survival than patients with primary tumors, which suggests that patients with primary tumors have a higher probability of surviving. The log-rank test was used to test for differences between the survival curves; the difference in survival between the two tumor groups was significant, with a p-value of 4.44e-15. The second approach compared the Cox proportional hazards (PH) model with parametric models. Residual-based criteria were used to judge the goodness of fit of the candidate models. The Cox PH model was used to estimate the effect of covariates on the hazard function.
With the Cox PH model, the influence of standard clinical prognostic factors on the hazard rate of prostate cancer patients was assessed. These prognostic factors are the prostate-specific antigen (PSA) level at diagnosis, tumor size, secondary Gleason grade, and Gleason score, which helps to determine the treatment. The Gleason score [HR 4.835, 95% CI 2.7847-8.3937, p = 2.20E-08] was the most significant progression-associated prognostic factor and proved to be an effective criterion for predicting death from prostate cancer. The Accelerated Failure Time (AFT) model was fitted to the data with four distributions. The AFT model with the Weibull distribution was chosen as the best model for our data by comparing Akaike Information Criterion (AIC) values.

Acknowledgements

I would like to thank my God, who got me this far and who blessed me with the right people to help me during the different stages of my study.

It gives me great pleasure to express my deepest respect and sincere thanks to my advisor, Professor Kalpdrum Passi, for his encouragement, valuable suggestions, discussion and guidance throughout my graduate studies. He continually and convincingly conveyed a spirit of adventure in regard to research. He was patient with my writing style and taught me how to explain my thoughts and present them clearly in writing. Without his guidance and persistent help this thesis would not have been possible.

I am deeply indebted to my committee member Dr. Hafida Boudjellaba, who always found time to provide constructive feedback on my thoughts. She provided me with technical support and became more of a mentor and friend than a professor. She answered my detail-oriented questions and helped me progress. I am grateful for her tremendous help at the initial stages of developing my thesis project. I would like to express my regards and thanks to Dr. Mazen Saleh, a member of my supervisory committee, for reading my thesis and providing valuable feedback. I would like to send my appreciation and respect to Dr. Peter Adamic for his help and suggestions.

I owe immense gratitude to my family for their love, help, and support, especially my parents, Ahmed Ali and Anisah Ahmed, for being supportive and helping me get all the annoying little things done, and my wonderful brother, Ali, for supporting me in my pursuit of this degree. I would like to express my gratitude to my sisters, Azhar and Asia, who were always there for me and cheered me on in all situations. I am also grateful to all my friends here in Sudbury and my friends in Saudi Arabia for their encouragement and for helping me change my career path. I couldn't have achieved this without their help.

I wish to express my deepest appreciation to King Abdullah for giving Saudi women the scholarship to complete their studies. I recognize that this thesis would not have been possible without the financial assistance of the Saudi Cultural Bureau in Canada and the Saudi Ministry of Higher Education.

This thesis is dedicated to my family and friends, without whose support and inspiration I would never have had the courage to follow my dreams. I love you and I miss you.

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
List of Appendices
Introduction
  1.1 Prostate Cancer
    1.1.1 Tumors
    1.1.2 Prognostic Factors in Prostate Cancer
    1.1.3 Treatment
  1.2 Survival Analysis
    1.2.1 Censored data
    1.2.2 Functions related to survival analysis
  1.3 Objectives
Chapter 2
Literature Review
  2.1 Survival Analysis Study
    Vinh-Hung, V. et al. (2002), Post-surgery radiation in early breast cancer: survival analysis of registry data
    Ray, M.E. et al. (2009), Potential surrogate endpoints for prostate cancer survival: analysis of a phase III randomized trial
    Chan, Y.M. (2013), Statistical Analysis and Modeling of Prostate Cancer
    Pulte, D. (2012), Changes in survival by ethnicity of patients with cancer between 1992–1996 and 2002–2006: is the discrepancy decreasing?
Chapter 3
Materials and Methodology
  3.1 The Data source
  3.2 Methodology
    3.2.1 Non-parametric Methods
      Kaplan Meier Estimates (K-M)
      Log Rank
    3.2.2 Semi-parametric Methods
      Cox proportional hazard
      The Adequacy of a model
      Testing the proportional hazards assumption
    3.2.3 Parametric Methods
      Accelerated Failure Time Model (AFT)
Chapter 4
Results and Discussion
  4.1 Kaplan-Meier (K-M) Estimation
  4.2 Log-Rank Survival Estimates
  4.3 Cox Fit Model
    4.3.1 Testing the proportional hazards assumption using Schoenfeld’s residuals
    4.3.2 Evaluating overall model fitting
    4.3.3 Functional Form of Predictors
    4.3.4 Checking for Outliers
  4.4 Output of Accelerated Failure Time (AFT)
  4.5 Discussion
Chapter 5
Conclusion
Future work
References
Appendix

List of Figures

Figure
1.1 Illustration of left, right and interval censoring (Aaserud, 2011)
1.2 Generally used AFT in survival analysis (Sewalem, 2012)
1.3 Steps for analyzing clinical trial data for survival analysis in R
3.1 Illustrated description of the clinical data for prostate cancer
4.1 Survival curves of the two tumor groups (primary and Met) for the prostate data in Table 4.1
4.2 Lines for the prostate cancer data with two types of tumors
4.3 Survival times of patients with primary tumor according to Gleason grade
4.4 The Cox proportional hazards (PH) model, with error bars showing 95% confidence intervals
4.5 Schoenfeld residuals for each explanatory variable versus transformed time in a model fit to the prostate cancer data
4.6 Cumulative hazard plot of the Cox-Snell residual for Cox PH model to indicate the overall model
4.7 Plot of martingale residuals vs. covariates
4.8 Deviance residuals consist of information about the influential and outlier data
4.9 Cumulative hazard plot of the Cox-Snell residual for Weibull AFT model

List of Tables

Table
1.1 Four stages of Tumor (University of Maryland)
1.2 The survival time
1.3 The main model in survival analysis
3.1 Clinical data of prostate cancer
Descriptive statistics for the distributions of the variables (Taylor, 2010)
4.1 Initial sorted table for Kaplan-Meier and Log-Rank analysis
4.2 Calculation for the K-M estimate of the survival function for primary type of tumor
4.3 Calculation for the K-M estimate of the survival function for Met type of tumor
4.4 Calculation for the log-rank test to compare tumor groups for the data in Table 4.1
4.5 The Cox’s proportional hazards analysis for the prostate cancer patient
4.6 The hazard rate
4.7 Scaled Schoenfeld Residuals of Significant Covariates on the PH
4.8 Deviance residuals against the risk score
4.9 The log-likelihoods and Akaike Information Criterion (AIC) in the AFT models
4.10 Results from AFT models for time to progression with Weibull distribution