Publications

My research encompasses statistics, optimization, and machine learning. You are welcome to contact me if you are interested in any of my papers. Names of my students are underlined.

A full list is also available on Google Scholar.

Preprints and papers in progress

[In submission] Coupled Flow Matching
Wenxi Cai, Yuheng Wang, Naichen Shi, 2025. Link.

Highlights

Many nonlinear dimension-reduction methods, such as EigenMaps, t-SNE, UMAP, and VAEs, map high-dimensional data into informative low-dimensional embeddings. But what if we want to explicitly control the distribution of these embeddings?

[Figure: CPFM]

We develop a Coupled Flow Matching framework that unifies optimal transport and generative modeling. It consists of two components: an efficient solver for a generalized form of Gromov-Wasserstein optimal transport, and a dual conditional flow-matching network that learns bidirectional mappings between data and embeddings. Together, they enable mapping complex, high-dimensional data into controllable low-dimensional representations, and generating realistic data samples from them.

[Figure: QM9]
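For orientation, the two ingredients named above can be written in their textbook forms. This is only a sketch under assumptions, not the paper's formulation: the symbols $c_X$, $c_Z$, $\pi$, $v_\theta$, and the linear interpolation path are standard choices introduced here for illustration, and the generalized Gromov-Wasserstein problem and the dual network used in the paper may differ.

```latex
% Standard squared Gromov-Wasserstein problem between data x ~ \mu and embeddings z ~ \nu.
% c_X and c_Z are within-space costs (assumed notation); \Pi(\mu,\nu) is the set of couplings.
\mathrm{GW}(\mu,\nu) = \min_{\pi \in \Pi(\mu,\nu)}
  \iint \bigl| c_X(x,x') - c_Z(z,z') \bigr|^2 \, d\pi(x,z)\, d\pi(x',z')

% Standard conditional flow-matching loss with a linear path between paired samples (x_0, x_1).
x_t = (1-t)\,x_0 + t\,x_1, \qquad
\mathcal{L}(\theta) = \mathbb{E}_{t \sim \mathcal{U}[0,1],\,(x_0,x_1)}
  \bigl\| v_\theta(x_t, t) - (x_1 - x_0) \bigr\|^2
```

In this generic picture, the coupling $\pi$ from the first problem would supply the pairs $(x_0, x_1)$ on which a velocity field $v_\theta$ is trained.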

[In submission] It Takes One to Bias Them All: Breaking Bad with One-Shot GRPO
Naihao Deng, Yilun Zhu, Naichen Shi, Clayton Scott, Rada Mihalcea. Link

[In submission] ALBATROSS: Cheap Filtration Based Geometry via Stochastic Sub-Sampling
Andrew J Stier, Naichen Shi, Raed Al Kontar, Chad Giusti, Marc G Berman. Link

[In submission] Heterogeneous Matrix Factorization: When Features Differ by Dataset
Naichen Shi, Salar Fattahi, Raed Kontar. Link

Journal papers

[IEEE T-ASE] Diffusion-Based Surrogate Modeling and Multi-Fidelity Calibration
Naichen Shi, Hao Yan, Shenghan Guo, Raed Kontar. IEEE Transactions on Automation Science and Engineering, 2025. Link, Code.

Highlights

Diffusion generative models can generate photorealistic images and videos but often struggle to capture physical interactions. When practitioners in science and engineering have access to physics simulators, can they use simulations to improve the quality of samples generated by diffusion models?

[Figure: illustration]

Here's a generated video.

We explore two strategies to incorporate physics simulation into diffusion models. Results show that our model indeed integrates physics knowledge in heat and fluid dynamics with patterns from real observations.

[JMLR] Triple Component Matrix Factorization: Untangling Global, Local, and Noisy Components
Naichen Shi, Salar Fattahi, Raed Al Kontar. Journal of Machine Learning Research (JMLR), 2024. Link

[Technometrics] Personalized Tucker Decomposition: Modeling Commonality and Peculiarity on Tensor Data
Jiuyun Hu, Naichen Shi, Raed Kontar, Hao Yan. Technometrics, 2024. Link

[JMLR] Personalized PCA: Decoupling Shared and Unique Features
Naichen Shi, Raed Al Kontar. Journal of Machine Learning Research (JMLR), 2024. Link, Video, Code.

Highlights

When data are collected from multiple related but heterogeneous sources, how can we efficiently integrate the information they share? And how can we describe and exploit the features unique to each source? We take advantage of the feature-extraction power of principal component analysis (PCA): global principal components (PCs) model the common information, while local PCs capture the information unique to each source.

[Figure: Personalized PCA]
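The split into global and local PCs described above can be illustrated with a small synthetic example. The sketch below is a naive two-stage heuristic (pool the covariances for the global PCs, then deflate each source for its local PCs). It is not the algorithm from the paper; the function names, the synthetic model, and all sizes are assumptions chosen for illustration.

```python
import numpy as np

def top_eigvecs(S, r):
    """Return the r leading eigenvectors of a symmetric matrix S."""
    _, Q = np.linalg.eigh(S)           # eigh returns eigenvalues in ascending order
    return Q[:, ::-1][:, :r]

def naive_global_local_pca(datasets, r_global, r_local):
    """Naive heuristic (hypothetical helper): global PCs from the averaged
    covariance, local PCs from each covariance after removing the global span.
    Assumes every dataset is (samples x features) and already zero-mean."""
    covs = [Y.T @ Y / Y.shape[0] for Y in datasets]
    U = top_eigvecs(sum(covs) / len(covs), r_global)
    P = np.eye(U.shape[0]) - U @ U.T   # projector onto the complement of span(U)
    Vs = [top_eigvecs(P @ S @ P, r_local) for S in covs]
    return U, Vs

def subspace_dist(A, B):
    """Frobenius distance between the projectors onto span(A) and span(B)."""
    return np.linalg.norm(A @ A.T - B @ B.T)

# Synthetic data: 3 sources share 2 global directions; each has 1 local direction.
rng = np.random.default_rng(0)
d, n, n_sources = 10, 5000, 3
E = np.eye(d)
U_true = E[:, :2]
V_true = [E[:, [2 + i]] for i in range(n_sources)]
datasets = []
for V in V_true:
    g = rng.normal(scale=2.0, size=(n, 2))     # global scores
    l = rng.normal(scale=2.0, size=(n, 1))     # local scores
    noise = rng.normal(scale=0.1, size=(n, d))
    datasets.append(g @ U_true.T + l @ V.T + noise)

U_hat, V_hats = naive_global_local_pca(datasets, r_global=2, r_local=1)
print("global error:", subspace_dist(U_hat, U_true))
print("local errors:", [subspace_dist(Vh, Vt) for Vh, Vt in zip(V_hats, V_true)])
```

The heuristic works in this toy setting because the local directions differ across sources, so averaging the covariances shrinks their contribution relative to the shared directions.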

Identifiability is a key hurdle because global PCs can be confounded with local PCs in the data. We propose a misalignment condition that measures the "smallest difference" among the subspaces spanned by global PCs. The condition helps us establish an upper bound on the statistical error of the global and local PCs that nearly matches the corresponding lower bound. Intriguingly, the results suggest that a higher level of heterogeneity can decrease the statistical error of our method, a benefit of personalization.

Despite its simplicity, Personalized PCA and its derivatives have proven valuable in a variety of fields, including additive manufacturing and solar flare detection.

[Figures: 3D printing; solar flare]

This method was covered in an article on Phys.org.

[JMS] Personalized feature extraction for manufacturing process signature characterization and anomaly detection
Naichen Shi, Shenghan Guo, Raed Al Kontar. Journal of Manufacturing Systems, 2024. Link.

[Technometrics] Personalized Federated Learning via Domain Adaptation with an Application to Distributed 3D Printing
Naichen Shi, Raed Al Kontar. Technometrics, 2023. Link, Video, Code.

[IEEE T-ASE] Fed-ensemble: Ensemble Models in Federated Learning for Improved Generalization and Uncertainty Quantification
Naichen Shi, Raed Al Kontar. IEEE Transactions on Automation Science and Engineering, 2022. Link, Code.

[IEEE Access] The Internet of Federated Things
Raed Kontar, Naichen Shi, Xubo Yue, Seokhyun Chung, Eunshin Byon, Mosharaf Chowdhury, Judy Jin, Wissam Kontar, Neda Masoud, Maher Noueihed, Chinedum E. Okwudire, Garvesh Raskutti, Romesh Saigal, Karandeep Singh, and Zhisheng Ye. IEEE Access, 2021. Link.

Conference papers

[ICML] SURGE: Unbiased Data Assimilation for Diffusion Model via Particle Filtering
Lifu Wei, Yinuo Ren, Naichen Shi, Yiping Lu. Forty-Third International Conference on Machine Learning (ICML), 2026. Link

[AISTATS] Calibrated Principal Component Regression
Yixuan Florence Wu, Yilun Zhu, Lei Cao, Naichen Shi. Twenty-Ninth Annual Conference on Artificial Intelligence and Statistics (AISTATS), 2026. Link

Highlights

When we reduce the dimension of the input data using PCA, we reduce data complexity by retaining only the most relevant information. However, using only the top PCA embeddings for downstream analytics, such as regression, carries the risk that meaningful information in the remaining PCs is discarded.

[Figure: CPCR]

We introduce a Calibrated Principal Component Regression model that leverages cross-fitting to restore some of the information lost in PCA. A risk analysis grounded in random matrix theory reveals the optimal tradeoff between bias and variance.
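The risk of discarding low-variance PCs can be seen in a toy example with plain principal component regression. The sketch below is not the calibrated method from the paper and does not use cross-fitting; the function names and the synthetic setup, in which the response depends only on the lowest-variance feature, are assumptions made for illustration.

```python
import numpy as np

def pcr_predict(X, y, k):
    """Plain PCR: regress y on the top-k principal component scores of X
    and return the in-sample fitted values."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:k].T                            # top-k PC scores
    coef, *_ = np.linalg.lstsq(Z, yc, rcond=None)
    return Z @ coef + y.mean()

def r_squared(y, y_hat):
    """Coefficient of determination of fitted values y_hat."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# Five independent features with decreasing spread; the signal lives
# entirely on the last (lowest-variance) feature.
rng = np.random.default_rng(0)
n = 2000
stds = np.array([5.0, 4.0, 3.0, 2.0, 1.0])
X = rng.normal(size=(n, 5)) * stds
y = 3.0 * X[:, 4] + 0.1 * rng.normal(size=n)

r2_top2 = r_squared(y, pcr_predict(X, y, k=2))   # keeps only high-variance PCs
r2_all = r_squared(y, pcr_predict(X, y, k=5))    # keeps every PC (ordinary least squares)
print("R^2 with top-2 PCs:", r2_top2)
print("R^2 with all PCs:  ", r2_all)
```

With only the top two PCs, the fit explains almost none of the response, while keeping all five recovers it. This is the kind of discarded information that a calibration step would aim to restore.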

[NeurIPS] Inv-Entropy: A Fully Probabilistic Framework for Uncertainty Quantification in Language Models
Haoyi Song, Ruihan Ji, Naichen Shi, Fan Lai, Raed Al Kontar. The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025. Link

[NeurIPS Spotlight] Personalized Dictionary Learning for Heterogeneous Datasets
Geyu Liang, Naichen Shi, Raed Al Kontar, Salar Fattahi. Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), 2023. Link, Code.

[MSEC] Process Signature Characterization and Anomaly Detection with Personalized PCA in Laser-Based Metal Additive Manufacturing
Naichen Shi, Raed Kontar, Shenghan Guo. Proceedings of the ASME 2023 18th International Manufacturing Science and Engineering Conference, 2023. Link.

[NeurIPS Spotlight] Adam Can Converge Without Any Modification On Update Rules
Yushun Zhang, Congliang Chen, Naichen Shi, Ruoyu Sun, Zhiquan Luo. Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), 2022. Link.

[ICLR Spotlight] RMSprop converges with proper hyper-parameter
Naichen Shi, Dawei Li, Mingyi Hong, and Ruoyu Sun. International Conference on Learning Representations (ICLR), 2021. Link, Video, Code.

Highlights

Almost every ML/AI practitioner uses adaptive step-size optimization algorithms (e.g., Adam). Surprisingly, an important theoretical question had been largely unexplored: under what conditions do they converge? We show, both theoretically and numerically, that the good performance of RMSprop and Adam is contingent on an appropriate choice of the exponential averaging parameter β2. Only when β2 is close enough to 1 can (stochastic versions of) Adam and RMSprop generate stable update directions that gradually lead the iterates toward optimality.

[Figure: Adam updates]
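To make the role of β2 concrete, here is a minimal sketch of the standard bias-corrected Adam update, with β1 = 0 giving an RMSprop-style step. It is a generic textbook form rather than the exact variants analyzed in the paper, and the helper `step_sizes` and the gradient sequence are hypothetical choices for illustration.

```python
import numpy as np

def adam_step(x, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One bias-corrected Adam update; beta1 = 0 gives an RMSprop-style step."""
    m = beta1 * m + (1.0 - beta1) * grad         # running average of gradients
    v = beta2 * v + (1.0 - beta2) * grad ** 2    # running average of squared gradients
    m_hat = m / (1.0 - beta1 ** t)               # bias corrections (t starts at 1)
    v_hat = v / (1.0 - beta2 ** t)
    x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x, m, v

def step_sizes(grads, beta2, lr=0.1):
    """Hypothetical helper: absolute update sizes for a scalar gradient
    sequence, using beta1 = 0 so that only beta2 matters."""
    x, m, v = 0.0, 0.0, 0.0
    sizes = []
    for t, g in enumerate(grads, start=1):
        x_new, m, v = adam_step(x, g, m, v, t, lr=lr, beta1=0.0, beta2=beta2)
        sizes.append(abs(x_new - x))
        x = x_new
    return sizes

grads = [1.0, 0.01]                              # a large gradient, then a tiny one
sizes_no_memory = step_sizes(grads, beta2=0.0)   # v forgets everything: sign-like steps
sizes_long_memory = step_sizes(grads, beta2=0.999)
print("beta2 = 0:    ", sizes_no_memory)
print("beta2 = 0.999:", sizes_long_memory)
```

With β2 = 0 the second update is as large as the first even though the gradient shrank a hundredfold, because each step is normalized by the current gradient alone. With β2 close to 1 the normalizer remembers past gradients, so the small gradient yields a correspondingly small step.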