Jean-Eudes J. Dazard, PhD
Assistant Professor, Center for Proteomics and Bioinformatics–Bioinformatics Division
jxd101@case.edu 216.368.4784 (o) 216.368.6846 (f)
Member, Nonprogrammatically Aligned
Conventional statistical techniques break down, or are at best inappropriate, when applied to modern large datasets in which the number of variables (p) greatly exceeds the number of observations (n), the so-called p >> n paradigm. This is a hard problem: several statistical issues combine to create a risk of severe errors and poorly fitted models. The particular challenges posed by high-dimensional data are the multiplicity of inferences and the control of error rates, the multi-collinearity of predictors due to the parallel nature of the variables, and sparsity, which arises both from the inherent noise of the technologies employed and from the small number of variables actually at play compared to the massive number interrogated.
My research interest is in computational and statistical biology, with an emphasis on developing data mining methods for high-dimensional settings. The data come mostly from high-throughput "omics" technologies such as microarrays, proteomics and high-throughput sequencing. My recent focus has been on:
- Bump hunting problems in Classification, Regression, and Survival settings. Applications include risk and reliability analysis tools, as well as clinical diagnostic and prognostic tools for personalized medicine.
- Bayesian Model Selection and Predictive Modeling as applied to Differential Expression and Genetic Interaction problems. Recent applications include genetic association studies, biomarker discovery, and proteomics interaction problems.
- Regularization and Variance Stabilization of high-dimensional data.
- Statistical Computing: Monte-Carlo methods. Parallel computing. Computational complexity (algorithmic and memory) of large datasets.