TDAI affiliate Dr. Xiaoxuan Cai, Assistant Professor of Statistics, will give a talk entitled "State space model multiple imputation for missing data in non-stationary multivariate time series with application in digital Psychiatry" for the Foundations of Data Science & AI community of practice, on Monday, February 27, at 3:00 p.m.
The event will be hybrid. Please let us know whether you plan to join us in person [Register for in-person] or attend remotely via Zoom [Register for Zoom link].
Mobile technology (e.g., mobile phones and wearable devices) provides effective and scalable methods for collecting physiological and behavioral biomarkers in patients’ naturalistic settings, as well as opportunities for therapeutic advancements and scientific discoveries regarding the etiology of psychiatric illness. Continuous data collection yields a new type of data: entangled multivariate time series of outcome, exposure, and covariates. Missing data is a pervasive problem in biomedical and social science research, and Ecological Momentary Assessment (EMA) in psychiatric research via mobile devices is no exception. However, the complex data structure of multivariate time series and their non-stationarity make missing data a major challenge for proper inference. Time series analyses typically include history information as explanatory variables to control for auto-correlation, exacerbating the missing data problem and potentially rendering it infeasible to adjust appropriately for confounding. The majority of available imputation methods are designed either for longitudinal data with limited follow-up times or for stationary time series. Limited work on non-stationary time series either focuses on missing exogenous information or ignores the complex relationship among outcome, exposure, and covariate time series. How to handle missing data in complex non-stationary multivariate time series is a key problem that remains unresolved, and the performance of existing imputation methods remains to be evaluated in the context of non-stationary mobile device data. We propose a novel data imputation solution based on the state space model and multiple imputation to properly address missing data in non-stationary multivariate time series.
We demonstrate its advantages over other widely used missing data imputation strategies by evaluating its theoretical properties and empirical performance in simulations of both stationary and non-stationary time series, subject to various missing-data mechanisms. We apply the proposed method to investigate the association between digital social interaction and negative mood in a multi-year smartphone observational study of patients with bipolar disorder and schizophrenia.