The College of Public Health is hosting a candidate talk entitled “Spatiotemporal Modeling with Applications to Stroke Mortality and Data Privacy” by Harrison S. Quick, PhD, of the Division for Heart Disease and Stroke Prevention, National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention.
Abstract: In this talk, I will discuss how statistical methods from the field of disease mapping can be used in the area of data privacy. The motivating dataset for this work consists of county-level stroke death counts from 1973-2013 across multiple age groups. As stroke mortality is relatively uncommon (affecting fewer than 100 individuals ages 65-74 per 100,000), these data are plagued with low counts. As such, there are two primary objectives for this talk. First and foremost, we wish to identify and summarize spatiotemporal trends in stroke mortality across these age groups. This will require the development of flexible hierarchical models that account not only for spatiotemporal associations but also for the correlation between age groups, in order to achieve reliable rate estimates. The second objective pertains to the release of public-use data and data privacy. Per the guidelines of the National Center for Health Statistics, death counts below 10 should be suppressed to protect the confidentiality of data subjects. For these data, however, this criterion results in nearly 70% of the over 380,000 data points being suppressed. Because such high suppression rates can significantly reduce the utility of the publicly available data, the second goal of this work is to generate so-called “synthetic data” that preserve the complex dependence structure of the original data while avoiding the disclosure risks associated with releasing unsuppressed confidential data.