Publications by William J. Welch
2018
Toxic Colors: The Use of Deep Learning for Predicting Toxicity of Compounds Merely from Their Graphic Images. Journal of Chemical Information and Modeling. 2018; (in press).
Bayesian Optimization Using Monotonicity Information and Its Application in Machine Learning Hyperparameter Tuning. In Proceedings of AutoML 2018 @ ICML/IJCAI-ECAI [Internet]. 2018. URL: https://sites.google.com/site/automl2018icml/accepted-papers/59.pdf .
2017
Discussion of Random-Projection Ensemble Classification by T. I. Cannings and R. J. Samworth. Journal of the Royal Statistical Society B [Internet]. 2017; 79: 1024–1025. DOI: 10.1111/rssb.12228 URL: http://dx.doi.org/10.1111/rssb.12228 .
Flexible Correlation Structure for Accurate Prediction and Uncertainty Quantification in Bayesian Gaussian Process Emulation of a Computer Model. SIAM/ASA Journal on Uncertainty Quantification [Internet]. 2017; 5: 598–620. DOI: 10.1137/15M1008774 URL: https://doi.org/10.1137/15M1008774 .
2016
Analysis Methods for Computer Experiments: How to Assess and What Counts? Statistical Science [Internet]. 2016; 31: 40–60. DOI: 10.14288/1.0302078 URL: https://dx.doi.org/10.14288/1.0302078 .
Comment: Expected Improvement for Efficient Blackbox Constrained Optimization. Technometrics [Internet]. 2016; 58: 12–15. URL: http://www.tandfonline.com/doi/full/10.1080/00401706.2015.1044119 .
Exploiting Multiple Descriptor Sets in QSAR Studies. Journal of Chemical Information and Modeling [Internet]. 2016; 56: 501–509. URL: http://pubs.acs.org/doi/abs/10.1021/acs.jcim.5b00663 .
Using a Gaussian Process as a Nonparametric Regression Model. Quality and Reliability Engineering International. 2016; 32: 673–680.
2015
Ensembling classification models based on phalanxes of variables with applications in drug discovery. The Annals of Applied Statistics [Internet]. 2015; 9: 69–93. URL: http://projecteuclid.org/euclid.aoas/1430226085 .
2014
Air Quality Model Evaluation Using Gaussian Process Modelling and Empirical Orthogonal Function Decomposition. In Air Pollution Modeling and its Application XXIII [Internet]. Springer International Publishing; 2014. pp. 457–462. URL: http://link.springer.com/chapter/10.1007/978-3-319-04379-1_75 .
Design of Computer Experiments for Optimization, Estimation of Function Contours, and Related Objectives. Statistics in Action: A Canadian Outlook [Internet]. 2014; 109. URL: http://books.google.com/books?hl=en&lr=&id=0GbvAgAAQBAJ&oi=fnd&pg=PA109&dq=info:6iJ547N97RwJ:scholar.google.com&ots=ZsI6ZcqMhr&sig=xk0x1HgmPBsC7dJovZQoMtlGCJQ .
2013
Analysis Methods for Computer Experiments: How to Assess and What Counts? [Internet]. Tech. rep., University of British Columbia; 2013. URL: http://www.stat.ubc.ca/~will/docs/whatcounts.pdf .
2012
Harvesting Classification Trees for Drug Discovery. Journal of Chemical Information and Modeling [Internet]. 2012; 52: 3169–3180. URL: http://pubs.acs.org/doi/abs/10.1021/ci3000216 .
2011
ChemModLab: A web-based cheminformatics modeling laboratory. In Silico Biology [Internet]. 2011; 11: 61–81. URL: http://content.iospress.com/articles/in-silico-biology/ci000016 .
Efficient, adaptive cross-validation for tuning and comparing models, with application to drug discovery. The Annals of Applied Statistics [Internet]. 2011: 2668–2687. URL: http://www.jstor.org/stable/23069346 .
2010
Model-based linear clustering. Canadian Journal of Statistics [Internet]. 2010; 38: 716–737. URL: http://onlinelibrary.wiley.com/doi/10.1002/cjs.10082/full .
2009
Choosing the Sample Size of a Computer Experiment: A Practical Guide. Technometrics [Internet]. 2009; 51: 366–376. DOI: 10.1198/TECH.2009.08040 URL: http://dx.doi.org/10.1198/TECH.2009.08040 .
2007
Exploration of cluster structure-activity relationship analysis in efficient high-throughput screening. Journal of Chemical Information and Modeling [Internet]. 2007; 47: 1206–1214. URL: http://pubs.acs.org/doi/abs/10.1021/ci600458n .
Fast Bayesian inference for Gaussian process models [Internet]. Technical Report 230, Dept. Statistics, Univ. British Columbia; 2007. URL: http://be.stat.ubc.ca/Research/TechReports/techreports/230.pdf .
Correlation parameterization in random function models to improve normal approximation of the likelihood or posterior [Internet]. Technical Report 229, Dept. of Statistics, The University of British Columbia; 2007. URL: http://ftp.stat.ubc.ca/Research/TechReports/techreports/229.pdf .
2006
Dynamic variable selection in SNP genotype autocalling from APEX microarray data. BMC Bioinformatics [Internet]. 2006; 7: 1. URL: http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-7-521 .
Screening the input variables to a computer model via analysis of variance and visualization. In Screening [Internet]. Springer New York; 2006. pp. 308–327. URL: http://link.springer.com/chapter/10.1007/0-387-28014-6_14 .
Computer model calibration or tuning in practice. Technometrics, submitted for publication [Internet]. 2006. URL: https://www.stat.ubc.ca/Research/TechReports/techreports/221.pdf .
2004
Comparison of methods based on diversity and similarity for molecule selection and the analysis of drug discovery data. In Chemoinformatics [Internet]. Springer; 2004. pp. 301–315. URL: http://link.springer.com/protocol/10.1385/1-59259-802-1:301 .
2003
Classification for Ranking in Drug Discovery: Identifying and Aggregating Relevant Subsets of Variables. Proceedings of the ISI Conference on Environmental Statistics and Health, Santiago de Compostela, Spain [Internet]. 2003: 173–181. URL: http://books.google.com/books?hl=en&lr=&id=xBll4aQqxKAC&oi=fnd&pg=PA173&dq=info:yaGcD_7a4fgJ:scholar.google.com&ots=uvxgHFyu46&sig=PaTT5DK4PYIZjdia_dlDMsOQCwI .
2002
Mining nuggets of activity in high dimensional space from high throughput screening data [Internet]. University of Waterloo; 2002. URL: http://www.bisrg.uwaterloo.ca/archive/RR-02-01.pdf .
Initial compound selection for sequential screening. Current Opinion in Drug Discovery and Development [Internet]. 2002; 5: 422–427. URL: http://genomics10.bu.edu/megonw/pdf/active_learning/sequential_screening_young.pdf .
Uniform Coverage Designs for Molecule Selection. Technometrics [Internet]. [Taylor & Francis, Ltd., American Statistical Association, American Society for Quality]; 2002; 44: 99–109. URL: http://www.jstor.org/stable/1271254 .
Design and analysis of computer experiments when the output is highly correlated over the input space. Canadian Journal of Statistics [Internet]. 2002; 30: 109–126. URL: http://onlinelibrary.wiley.com/doi/10.2307/3315868/abstract .
Mining nuggets of activity in high dimensional space from high throughput screening data [Internet]. Technical report, IIQP; 2002. URL: https://www.researchgate.net/profile/William_Welch/publication/246834527_Mining_nuggets_of_activity_in_high_dimensional_space_from_high_throughput_screening_data/links/0a85e53c28a25cb5a6000000.pdf .
2001
Statistical methods for deterministic biomathematical models. Proceedings of the 53rd Session of the International Statistical Institute, Seoul, Korea [Internet]. 2001. URL: http://scholar.google.com/scholar?cluster=15064797966098008106&hl=en&oi=scholarr .
1999
Sensitivity analysis of computer models: World Bank HDM-III model. Journal of Transportation Engineering [Internet]. 1999; 125: 421–428. URL: http://ascelibrary.org/doi/abs/10.1061/(ASCE)0733-947X(1999)125:5(421) .
Design and analysis for modeling and predicting spatial contamination. Mathematical Geology [Internet]. 1999; 31: 1–22. URL: http://link.springer.com/article/10.1023/A:1007504329298 .
Analysis of protein activity data by Gaussian stochastic process models. Journal of Biopharmaceutical Statistics [Internet]. 1999; 9: 145–160. URL: http://www.tandfonline.com/doi/abs/10.1081/BIP-100101005 .
1998
Circuit optimization via sequential computer experiments: design of an output buffer. Journal of the Royal Statistical Society: Series C (Applied Statistics) [Internet]. 1998; 47: 31–48. URL: http://onlinelibrary.wiley.com/doi/10.1111/1467-9876.00096/abstract .
Efficient global optimization of expensive black-box functions. Journal of Global Optimization [Internet]. 1998; 13: 455–492. URL: http://link.springer.com/article/10.1023/A:1008306431147 .
Fisher information and maximum-likelihood estimation of covariance parameters in Gaussian stochastic processes. Canadian Journal of Statistics [Internet]. 1998; 26: 127–137. URL: http://onlinelibrary.wiley.com/doi/10.2307/3315678/abstract .
Global versus local search in constrained optimization of computer models. Lecture Notes-Monograph Series [Internet]. 1998: 11–25. URL: http://www.jstor.org/stable/4356058 .
1997
Efficient experimental design strategy for numerical ocean modelling. In Canadian Meteorological and Oceanographic Society Bulletin. 1997. pp. 95–102.
1996
Predicting urban ozone levels and trends with semiparametric modeling. Journal of Agricultural, Biological, and Environmental Statistics [Internet]. 1996: 404–425. URL: http://www.jstor.org/stable/1400436 .
Robust design for censored exponential data [Internet]. University of Waterloo; 1996. URL: http://www.bisrg.uwaterloo.ca/archive/RR-96-07.pdf .
Response to James M. Lucas. Technometrics [Internet]. 1996; 38: 199–203. URL: http://www.tandfonline.com/doi/pdf/10.1080/00401706.1996.10484496 .
1994
Parameter space exploration of an ocean general circulation model using an isopycnal mixing parameterization. Journal of Marine Research [Internet]. 1994; 52: 773–796. URL: http://www.ingentaconnect.com/content/jmr/jmr/1994/00000052/00000005/art00001 .
Criterion-robust optimal design. University of Waterloo; 1994.
Correcting for covariates in permutation tests. University of Waterloo; 1994.
Arctic sea ice variability: Model sensitivities and a multidecadal simulation. Journal of Geophysical Research: Oceans [Internet]. 1994; 99: 919–935. URL: http://onlinelibrary.wiley.com/doi/10.1029/93JC02564/full .
1993
Discussion of the paper «The foundation of experimental design and observation» by HP Wynn. Statistical Methods & Applications [Internet]. 1993; 2: 181–181. URL: http://www.springerlink.com/index/FG52588L05XQH674.pdf .
1992
Screening, predicting, and computer experiments. Technometrics [Internet]. 1992; 34: 15–25. URL: http://www.tandfonline.com/doi/abs/10.1080/00401706.1992.10485229 .
Taguchi's parameter design: a panel discussion. Technometrics [Internet]. 1992; 34: 127–161. URL: http://amstat.tandfonline.com/doi/abs/10.1080/00401706.1992.10484904 .
Integrated circuit design optimization using a sequential strategy. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems [Internet]. 1992; 11: 361–372. URL: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=124423 .