Wang, Kurtek to explore data structures, shapes, and dynamics

TDAI affiliates Yusu Wang and Sebastian Kurtek are co-principal investigators on a new interdisciplinary project that has been awarded $500,000 through Phase 1 of the NSF’s Transdisciplinary Research in Principles of Data Science (TRIPODS) program. The project, entitled “Topology, Geometry, and Data Analysis (TGDA@OSU): Discovering Structure, Shape, and Dynamics in Data,” is led by PI Tamal Dey (computer science and engineering), along with co-PIs David Sivakoff (statistics, math) and Facundo Memoli (computer science and engineering, math).

Abstract:

This project will advance the methodological and theoretical foundations of data analytics by considering the geometric and topological aspects of complex data from mathematical, statistical and algorithmic perspectives, thus enhancing the synergy between the Computer Science, Mathematics, and Statistics communities. Furthermore, this project will benefit a range of impactful scientific areas including medicine, neuroanatomy, machine learning, geographic information systems, mechanical engineering designs, and political science. The research products will be implemented and disseminated through software packages and tutorials, allowing widespread application by industrial and academic practitioners. Through this project, the PIs will develop curricula for cross-disciplinary undergraduate and graduate education. A data science curriculum is already offered jointly by Statistics and Computer Science and Engineering at The Ohio State University, including the recent Data Analytics undergraduate major, providing a platform to develop new courses and an opportunity to engage future industry leaders in basic research. Additionally, this project aims to develop partnerships with the Translational Data Analytics and the Mathematical Biosciences Institutes at OSU, as well as other internal and external research and education centers. Plans for workshops and summer schools are included for outreach and training purposes.

In the past few decades, a large number of models, methods, and algorithmic frameworks have been developed for data science. However, as data become increasingly complex, the field faces new challenges. In particular, the non-Euclidean nature, the higher order connectivity, the hidden global cues, and the dynamics regulating the data pose further challenges to existing methods. This project will explore and leverage the geometric and topological structures inherent in the data to tackle some of these problems. The main aims are to discover, model and reveal information in the form of (i) structures in data, (ii) shapes from data, and (iii) dynamics underlying data. This project leverages concepts from the mathematical areas of differential and algebraic topology and geometry, applied statistics and combinatorics, and the computational areas of algorithms, graph theory, and statistical/machine learning. Research in geometric and topological data analysis has brought forth the need to recast and reinvestigate classical concepts in statistics and mathematics in the context of finite data, approximations, and noise. This project investigates explicit or hidden structures behind data, such as cluster trees, which are the basis for understanding and efficient processing of data. Additionally, the PIs aim to model the precise shape behind data globally or locally, which is essential for providing a platform where various statistical analyses can be carried out. Particular examples include the shape space of surface models and the tree space of phylogenetic trees. Finally, this project will consider dynamics in the data, where the interplay between temporal and topological/geometric features can lead to deeper insights. All of these areas will inevitably be enriched by new applications.
