Department of Statistics

Colloquia

The colloquia listed here are presented by visiting academic researchers and members of the business community, as well as by USC faculty and graduate students. The research topics introduced by the speakers span all areas of statistics.

Faculty, students, and off-campus visitors are invited to attend any of our colloquia and the Palmetto Lecture Series.

2025 – 2026 Department of Statistics Colloquium Speakers

When: Thursday, September 4, 2025—2:50 p.m. to 3:50 p.m.
Where: LeConte 224

Speaker: Dr. Anru Zhang, Department of Biostatistics & Bioinformatics and Department of Computer Science, Duke University

Abstract: The increasing availability of electronic health records (EHRs) and other biomedical data calls for methodologies that can generate high-quality synthetic data while preserving privacy, correcting bias, and addressing complex data structures. In this talk, I will present a series of recent advances in generative modeling for synthetic health data. First, using denoising diffusion probabilistic models, we develop a framework for generating realistic, privacy-preserving EHR time series that achieve superior fidelity and lower privacy risk than existing methods. Second, to address irregularly observed functional data, we introduce Smooth Flow Matching (SFM), a semiparametric copula flow framework capable of generating smooth, infinite-dimensional trajectories under irregular sampling and non-Gaussian structures. Finally, we propose a bias-corrected data synthesis strategy for imbalanced learning, which mitigates distortions introduced by synthetic samples and enhances predictive performance in rare-event classification. Collectively, these methods provide a principled foundation for generative modeling of synthetic health data, enabling privacy-preserving, bias-reduced analysis and broader utilization of sensitive biomedical datasets.

When: Thursday, September 4, 2025—2:50 p.m. to 3:50 p.m.
Where: LeConte 224

Speaker: Dr. Cong Ma, Department of Statistics, University of Chicago

Abstract: Integrative data analysis often requires separating shared from individual variations across multiple datasets, typically using the Joint and Individual Variation Explained (JIVE) model. Despite its popularity, theoretical insights into JIVE methods remain limited, particularly in the context of multiple matrices and varying degrees of subspace misalignment. In this talk, I will present new theoretical results on the Angle-based JIVE (AJIVE) method—a two-stage spectral algorithm. Specifically, we establish that AJIVE achieves decreasing estimation error with an increasing number of matrices in high signal-to-noise ratio (SNR) regimes. In contrast, AJIVE faces inherent limitations in low-SNR conditions, where estimation error remains persistently high. Complementary minimax lower bounds confirm AJIVE’s optimal performance at high SNR, while analysis of an oracle estimator highlights fundamental limitations of spectral methods at low SNR. 

When: Thursday, September 18, 2025—2:50 p.m. to 3:50 p.m.
Where: LeConte 224

Speaker: Dr. Christopher Wikle, Department of Statistics, University of Missouri

Abstract: The world is full of extreme events. For example, a central question in public health planning might be to assess the likelihood of extreme exposures (meteorological conditions, air pollution, social stress, etc.). Such extreme events typically occur in spatial and/or temporal clusters. Yet, the principal methodologies that statisticians use to deal with spatially dependent processes (Gaussian processes and Markov random fields) are not suitable for complex tail dependence structures. This is particularly true of simulation model emulation. More flexible spatial extremes models exhibit appealing extremal dependence properties but are often prohibitively expensive to fit and simulate from in high dimensions. Here I present recent work where we develop a new spatial extremes model that has flexible and non-stationary dependence properties, and we integrate it in the encoding-decoding structure of a variational autoencoder (XVAE), whose parameters are estimated via variational Bayes combined with deep learning. The XVAE can be used to analyze high-dimensional data or as a spatio-temporal emulator that characterizes the distribution of potential mechanistic model output states and produces outputs that have the same statistical properties as the inputs, especially in the tail. Through extensive simulation studies, we show that our XVAE is substantially more time-efficient than traditional Bayesian inference while also outperforming many spatial extremes models with a stationary dependence structure. We demonstrate our method on a high-resolution satellite-derived dataset of sea surface temperature in the Red Sea and on a high-resolution simulation model of a turbulent plume, such as one would find in a wildfire. We note, however, that these methods can be applied to any data set or simulation model that exhibits extremes.

When: Thursday, September 25, 2025—2:50 p.m. to 3:50 p.m.
Where: LeConte 224

Speaker: Dr. Seungchul Baek, Department of Mathematics and Statistics, University of Maryland, Baltimore County

Abstract: I introduce two projects related to high-dimensional classification. The first project focuses on developing a classifier using random partitioning. Specifically, we split the original high-dimensional data ($p>n$) into multiple low-dimensional subsets, making sure the number of selected covariates is less than the sample size. Using these partitioned datasets, we apply linear discriminant analysis (LDA) to each subset and propose a method to aggregate the results. We provide theoretical justification for our approach by comparing its misclassification rates to those of LDA in high dimensions. The second project concerns variable selection in high-dimensional classification. By utilizing the recently proposed mirror statistic, we first identify significant variables and then develop a new classifier based on a modified version of the $\epsilon$-greedy algorithm.

When: Tuesday, October 14, 2025—2:50 p.m. to 3:50 p.m.
Where: LeConte 224

Speaker: Dr. Philip Ernst, Department of Mathematics, Imperial College London

Abstract: TBD

When: Thursday, October 16, 2025—2:50 p.m. to 3:50 p.m.
Where: LeConte 224

Speaker: Dr. Jason Klusowski, Department of Operations Research and Financial Engineering, Princeton University

Abstract: TBD

When: Thursday, October 30, 2025—2:50 p.m. to 3:50 p.m.
Where: LeConte 224

Speaker: Dr. Tingting Zhang, Department of Statistics, University of Pittsburgh

Abstract: The human brain is a high-dimensional directed network system of brain regions involving directed connectivity. Seizures are a directed network phenomenon, as abnormal neuronal activities start from a seizure onset zone (SOZ) and propagate to otherwise healthy regions. To localize the SOZ of an epileptic patient, clinicians use intracranial EEG (iEEG) to record the patient’s brain activity in many small regions. iEEG data are high-dimensional multivariate time series.
To model the underlying directed brain network, we build a state-space multivariate autoregression (SSMAR) model for iEEG data. To produce scientifically meaningful network results, we incorporate prior knowledge that brain networks tend to exhibit modular organization. Specifically, we assign a stochastic-blockmodel-motivated prior to the SSMAR parameters, which encourages modularity in the estimated networks.
We develop a Bayesian framework to estimate the SSMAR model, infer directed connections, and identify network modules. The method is robust to violations of model assumptions and outperforms existing network approaches. When applied to iEEG data from an epileptic patient, the model reveals patterns of seizure initiation and propagation and uncovers a distinct connectivity profile of the SOZ. We also extend this Bayesian approach to fMRI data, identifying functionally specialized modules and directed interactions between them.

When: Thursday, November 6, 2025—2:50 p.m. to 3:50 p.m.
Where: LeConte 224

Speaker: Dr. Nathaniel Josephs, Department of Statistics, North Carolina State University

Abstract: TBD

When: Thursday, November 13, 2025—2:50 p.m. to 3:50 p.m.
Where: LeConte 224

Speaker: Dr. Yichao Wu, Department of Mathematics, Statistics, and Computer Science, University of Illinois Chicago

Abstract: The first part of the talk will focus on the general partially linear model without any structural assumption on the nonparametric component. For such a model with both linear and nonlinear predictors being multivariate, we propose a new variable selection method. Our new method is a unified approach in the sense that it can select both linear and nonlinear predictors simultaneously by solving a single optimization problem. We prove that the proposed method achieves consistency. The second part of the talk will be based on an ongoing research project. In this project, we are extending the above variable selection method to partially global Fréchet regression (Tucker and Wu, 2025, Statistica Sinica).

Past colloquium talks are archived here.

