Publications
My research encompasses statistics, optimization, and machine learning. You are welcome to contact me if you are interested in any of my papers. Names of my students are underlined.
A full list is also available on Google Scholar.
Preprints and papers in progress
[In submission] Coupled Flow Matching
Wenxi Cai, Yuheng Wang, Naichen Shi, 2025. Link.
[In submission] It Takes One to Bias Them All: Breaking Bad with One-Shot GRPO
Naihao Deng, Yilun Zhu, Naichen Shi, Clayton Scott, Rada Mihalcea. Link
[In submission] ALBATROSS: Cheap Filtration Based Geometry via Stochastic Sub-Sampling
Andrew J Stier, Naichen Shi, Raed Al Kontar, Chad Giusti, Marc G Berman. Link
[In submission] Heterogeneous Matrix Factorization: When Features Differ by Dataset
Naichen Shi, Salar Fattahi, Raed Kontar. Link
Journal papers
[IEEE T-ASE] Diffusion-Based Surrogate Modeling and Multi-Fidelity Calibration
Naichen Shi, Hao Yan, Shenghan Guo, Raed Kontar. IEEE Transactions on Automation Science and Engineering, 2025. Link, Code.
Highlights

Here's a generated video.
[JMLR] Triple Component Matrix Factorization: Untangling Global, Local, and Noisy Components
Naichen Shi, Salar Fattahi, Raed Al Kontar. Journal of Machine Learning Research (JMLR), 2024. Link
[Technometrics] Personalized Tucker Decomposition: Modeling Commonality and Peculiarity on Tensor Data
Jiuyun Hu, Naichen Shi, Raed Kontar, Hao Yan. Technometrics, 2024. Link
[JMLR] Personalized PCA: Decoupling Shared and Unique Features
Naichen Shi, Raed Al Kontar. Journal of Machine Learning Research (JMLR), 2024. Link, Video, Code.
Highlights
When data are collected from multiple related but heterogeneous sources, how can we efficiently integrate the information these sources share? And how can we describe and make use of the features unique to each source? We take advantage of the feature extraction power of principal component analysis (PCA). More specifically, we use global principal components (PCs) to model the common information and local PCs to capture the unique information.

Identifiability is a key hurdle as the global PCs can be confounded with local PCs in data. We propose a misalignment condition that measures the "smallest difference" among the subspaces spanned by global PCs. The condition helps us establish an upper bound on the statistical error of the global and local PCs, which almost matches their lower bound. Intriguingly, the results suggest that a higher level of heterogeneity can decrease the statistical error in our method, a benefit of personalization.
Despite its simplicity, Personalized PCA and its derivatives have proven valuable in a variety of fields, including additive manufacturing and solar flare detection.
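As a rough illustration of the global/local split described above, here is a minimal Python sketch. It is not the algorithm from the paper: it swaps in a simple spectral heuristic (average the per-source PCA projectors, keep the directions every source agrees on, then run PCA on what is left). The function names, the simulated data, and the noise levels are all hypothetical and chosen only for illustration.

```python
import numpy as np


def top_eigvecs(sym, k):
    """Return the k leading eigenvectors of a symmetric matrix as columns."""
    _, vecs = np.linalg.eigh(sym)  # eigh sorts eigenvalues in ascending order
    return vecs[:, ::-1][:, :k]


def subspace_gap(A, B):
    """Frobenius distance between the projectors onto span(A) and span(B)."""
    return np.linalg.norm(A @ A.T - B @ B.T)


def shared_and_unique_pcs(datasets, n_global, n_local):
    """Heuristic split into global PCs and per-source local PCs.

    datasets: list of (d, n_i) arrays whose columns are observations.
    This is an illustrative heuristic, not the estimator from the paper.
    """
    # Step 1: ordinary PCA per source, keeping n_global + n_local directions.
    projectors = []
    for Y in datasets:
        U = top_eigvecs(Y @ Y.T / Y.shape[1], n_global + n_local)
        projectors.append(U @ U.T)

    # Step 2: a direction present in every source has eigenvalue close to 1
    # in the averaged projector; source-specific directions score lower.
    U_g = top_eigvecs(np.mean(projectors, axis=0), n_global)

    # Step 3: remove the global part, then run PCA on each residual.
    P_perp = np.eye(U_g.shape[0]) - U_g @ U_g.T
    local_pcs = []
    for Y in datasets:
        R = P_perp @ Y
        local_pcs.append(top_eigvecs(R @ R.T / R.shape[1], n_local))
    return U_g, local_pcs


def simulate_sources(seed=0, d=20, n=500, n_sources=3, n_global=2, n_local=2):
    """Hypothetical data: shared directions plus mutually orthogonal local ones."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((d, n_global + n_sources * n_local)))
    U_g = Q[:, :n_global]
    datasets, local_true = [], []
    for i in range(n_sources):
        start = n_global + i * n_local
        U_l = Q[:, start:start + n_local]
        Y = (U_g @ (3.0 * rng.standard_normal((n_global, n)))
             + U_l @ (2.0 * rng.standard_normal((n_local, n)))
             + 0.05 * rng.standard_normal((d, n)))
        datasets.append(Y)
        local_true.append(U_l)
    return datasets, U_g, local_true


datasets, U_g_true, local_true = simulate_sources()
U_g_hat, local_hat = shared_and_unique_pcs(datasets, n_global=2, n_local=2)
print("global subspace gap:", subspace_gap(U_g_hat, U_g_true))
for i, (est, truth) in enumerate(zip(local_hat, local_true)):
    print(f"source {i} local subspace gap:", subspace_gap(est, truth))
```

In this toy setting the local subspaces differ strongly across sources, which is what lets the averaged projector separate the shared directions from the source-specific ones. If the local subspaces were nearly aligned, this heuristic would struggle to tell them apart from the global PCs, which is the identifiability issue discussed above.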

An article from Phys.org reports this method.
[JMS] Personalized feature extraction for manufacturing process signature characterization and anomaly detection
Naichen Shi, Shenghan Guo, Raed Al Kontar. Journal of Manufacturing Systems, 2024. Link.
[Technometrics] Personalized Federated Learning via Domain Adaptation with an Application to Distributed 3D Printing
Naichen Shi, Raed Al Kontar. Technometrics, 2023. Link, Video, Code.
[IEEE T-ASE] Fed-ensemble: Ensemble Models in Federated Learning for Improved Generalization and Uncertainty Quantification
Naichen Shi, Raed Al Kontar. IEEE Transactions on Automation Science and Engineering, 2022. Link, Code.
[IEEE Access] The Internet of Federated Things
Raed Kontar, Naichen Shi, Xubo Yue, Seokhyun Chung, Eunshin Byon, Mosharaf Chowdhury, Judy Jin, Wissam Kontar, Neda Masoud, Maher Noueihed, Chinedum E. Okwudire, Garvesh Raskutti, Romesh Saigal, Karandeep Singh, and Zhisheng Ye. IEEE Access, 2021. Link.
Conference papers
[ICML] SURGE: Unbiased Data Assimilation for Diffusion Model via Particle Filtering
Lifu Wei, Yinuo Ren, Naichen Shi, Yiping Lu. Forty-Third International Conference on Machine Learning (ICML), 2026. Link
[AISTATS] Calibrated Principal Component Regression
Yixuan Florence Wu, Yilun Zhu, Lei Cao, Naichen Shi. Twenty-Ninth Annual Conference on Artificial Intelligence and Statistics (AISTATS), 2026. Link
[NeurIPS] Inv-Entropy: A Fully Probabilistic Framework for Uncertainty Quantification in Language Models
Haoyi Song, Ruihan Ji, Naichen Shi, Fan Lai, Raed Al Kontar. The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025. Link
[NeurIPS Spotlight] Personalized Dictionary Learning for Heterogeneous Datasets
Geyu Liang, Naichen Shi, Raed Al Kontar, Salar Fattahi. Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), 2023. Link, Code.
[MSEC] Process Signature Characterization and Anomaly Detection with Personalized PCA in Laser-Based Metal Additive Manufacturing
Naichen Shi, Raed Kontar, Shenghan Guo. Proceedings of the ASME 2023 18th International Manufacturing Science and Engineering Conference, 2023. Link.
[NeurIPS Spotlight] Adam Can Converge Without Any Modification On Update Rules
Yushun Zhang, Congliang Chen, Naichen Shi, Ruoyu Sun, Zhiquan Luo. Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), 2022. Link.
[ICLR Spotlight] RMSprop converges with proper hyper-parameter
Naichen Shi, Dawei Li, Mingyi Hong, and Ruoyu Sun. International Conference on Learning Representations (ICLR), 2021. Link, Video, Code.