Transfer Learning and Multi-task Learning
Statistical learning theory and methods for transfer and multi-task learning, including high-dimensional, unsupervised, federated, and privacy-preserving settings.
Overview
This page summarizes several of my recent works on Transfer Learning (TL) and Multi-task Learning (MTL), focusing on theoretical guarantees, algorithmic design, and applications to high-dimensional, unsupervised, and federated settings.
These works aim to answer a common question:
How can we safely and efficiently borrow information across related tasks or domains while avoiding negative transfer and preserving privacy?
Transfer Learning under High-Dimensional Generalized Linear Models
Ye Tian & Yang Feng, Journal of the American Statistical Association (JASA), 2023
Paper | R Package: glmtrans
Summary
This work develops a principled framework for transfer learning in high-dimensional GLMs, where multiple related source datasets can inform a target task. It proposes a two-step TransGLM procedure to automatically determine which sources are informative, mitigating negative transfer.
Highlights
- Introduces a transferable source detection mechanism to adaptively select useful sources.
- Provides non-asymptotic bounds for estimation and prediction errors.
- Establishes theoretical conditions under which transfer learning yields provable benefits.
- Includes a method for valid post-transfer inference (confidence intervals).
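To make the source-screening idea concrete, here is a minimal Python sketch, not the TransGLM estimator itself: each source model is judged by how well it explains the target data relative to a target-only fit, and only the sources that pass are pooled with the target for refitting. The binary logistic-lasso choice, the helper names `screen_sources` and `pooled_fit`, and the tolerance `eps` are illustrative assumptions; the actual procedure and its tuning are implemented in the glmtrans package.

```python
# Hedged sketch of a two-step "detect, then pool" transfer idea (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

def screen_sources(target, sources, eps=0.05, C=1.0):
    """Return indices of sources that look informative for the target task."""
    Xt, yt = target
    base = LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(Xt, yt)
    base_loss = log_loss(yt, base.predict_proba(Xt))

    keep = []
    for k, (Xs, ys) in enumerate(sources):
        fit_k = LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(Xs, ys)
        # A source is treated as "transferable" if its model explains the
        # target data nearly as well as the target-only fit.
        if log_loss(yt, fit_k.predict_proba(Xt)) <= base_loss + eps:
            keep.append(k)
    return keep

def pooled_fit(target, sources, keep, C=1.0):
    """Refit on the target data pooled with the selected sources."""
    Xt, yt = target
    X = np.vstack([Xt] + [sources[k][0] for k in keep])
    y = np.concatenate([yt] + [sources[k][1] for k in keep])
    return LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(X, y)
```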
Unsupervised Multi-task and Transfer Learning on Gaussian Mixture Models
Ye Tian, Haolei Weng, Lucy Xia, & Yang Feng, arXiv preprint, 2024
Paper | R Package: mtlgmm
Summary
This paper studies unsupervised MTL/TL for Gaussian Mixture Models (GMMs), proposing a robust and theoretically grounded approach that jointly learns across multiple mixture tasks.
Highlights
- A multi-task EM algorithm that learns shared latent structures across tasks.
- Robustness against outlier tasks with arbitrary distributions.
- Finite-sample guarantees and minimax-optimal estimation rates.
- Theoretical resolution of label-alignment issues across tasks.
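As a rough illustration of the shared-structure idea, and not of the paper's actual penalty or its label-alignment step, the sketch below runs EM for symmetric two-component Gaussian mixtures (means ±mu, unit variance) on each task and shrinks every task's mean update toward the cross-task average. The shrinkage weight `shrink` is a hypothetical knob, and the sign ambiguity between +mu and -mu across tasks is deliberately ignored here.

```python
# Hedged sketch of a multi-task EM flavor with cross-task shrinkage (illustrative only).
import numpy as np

def mtl_em(tasks, n_iter=50, shrink=0.5, seed=0):
    """tasks: list of (n_t, d) arrays; returns one mean estimate per task."""
    rng = np.random.default_rng(seed)
    d = tasks[0].shape[1]
    mus = [rng.normal(size=d) for _ in tasks]          # per-task mean estimates
    for _ in range(n_iter):
        local = []
        for X, mu in zip(tasks, mus):
            # E-step: posterior of the +mu component under the symmetric mixture
            w = 1.0 / (1.0 + np.exp(-2.0 * X @ mu))
            # Local M-step: standard EM update for the symmetric two-component GMM
            local.append(((2.0 * w - 1.0) @ X) / len(X))
        # Cross-task aggregation: pull each local update toward the average
        center = np.mean(local, axis=0)
        mus = [(1 - shrink) * m + shrink * center for m in local]
    return mus
```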
Towards the Theory of Unsupervised Federated Learning
Ye Tian, Haolei Weng, & Yang Feng, ICML 2024
Paper | Python package: FedGrEM
Summary
This paper provides one of the first non-asymptotic analyses of federated EM algorithms for unsupervised learning. It bridges the gap between theory and practice in federated expectation-maximization across heterogeneous clients.
Highlights
- Finite-sample convergence guarantees for federated EM under heterogeneity.
- Addresses label alignment and initialization issues unique to unsupervised federated setups.
- Empirically validated across distributed mixture models.
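The communication pattern can be sketched as follows, again for a symmetric two-component Gaussian mixture: the server broadcasts the current mean, each client computes its local E-step sufficient statistics, and the server aggregates them for a global M-step. This is a generic federated EM round for illustration, not the FedGrEM algorithm, which additionally handles heterogeneity, label alignment, and initialization; the function names are placeholders.

```python
# Hedged sketch of one synchronous federated EM round (illustrative only).
import numpy as np

def client_update(X, mu):
    """Local E-step: return the sufficient-statistic sum and the sample count."""
    w = 1.0 / (1.0 + np.exp(-2.0 * X @ mu))            # posterior of component +mu
    return (2.0 * w - 1.0) @ X, len(X)

def server_round(client_data, mu):
    """Aggregate client statistics and perform the global M-step."""
    stats = [client_update(X, mu) for X in client_data]
    total = sum(n for _, n in stats)
    return sum(s for s, _ in stats) / total

def federated_em(client_data, n_rounds=50, seed=0):
    mu = np.random.default_rng(seed).normal(size=client_data[0].shape[1])
    for _ in range(n_rounds):
        mu = server_round(client_data, mu)
    return mu
```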
Learning from Similar Linear Representations: Adaptivity, Minimaxity, and Robustness
Ye Tian, Yuqi Gu, & Yang Feng, Journal of Machine Learning Research (JMLR), 2025
Paper | Python package: RL-MTL-TL
Summary
This work provides a general theory for learning from similar but not identical linear representations. It develops adaptive algorithms that automatically determine how much to share across tasks, achieving robustness and minimax optimality.
Highlights
- Introduces an adaptive penalized ERM framework for shared representations.
- Characterizes regimes of beneficial versus negative transfer.
- Achieves minimax rates and adapts seamlessly to unknown similarity levels.
- Robust to outlier tasks and distributional shifts.
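A much-simplified sketch of the shared-representation idea, omitting the paper's adaptive penalization and robustness machinery: estimate each task's coefficients separately, take the top-r left singular subspace of the stacked estimates as the shared representation, then refit each task inside that subspace. The helper name `fit_shared` and the ridge level `lam` are illustrative assumptions.

```python
# Hedged sketch of estimating a shared low-dimensional linear representation (illustrative only).
import numpy as np

def fit_shared(tasks, r, lam=1e-2):
    """tasks: list of (X, y); returns (A, thetas) with A a d x r orthonormal basis."""
    d = tasks[0][0].shape[1]
    # Step 1: single-task ridge estimates
    betas = [np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y) for X, y in tasks]
    # Step 2: shared representation = top-r left singular vectors of the stacked estimates
    A = np.linalg.svd(np.column_stack(betas), full_matrices=False)[0][:, :r]
    # Step 3: refit each task in the span of A (low-dimensional least squares)
    thetas = [np.linalg.lstsq(X @ A, y, rcond=None)[0] for X, y in tasks]
    return A, thetas
```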
Geometry-Aware Multi-Task Representation Learning
Aoran Chen & Yang Feng, arXiv preprint, 2025
Paper | Python package: GeoERM
Summary
This paper introduces GeoERM, a geometry-aware multi-task learning (MTL) framework that respects the intrinsic Riemannian geometry of the representation space. Rather than treating the shared representation as a point in Euclidean space, the method performs optimization on the appropriate manifold, combining:
- Riemannian gradient steps that align with the curvature of the search space,
- Polar retraction to ensure that updates remain on the manifold.
GeoERM incurs a per-iteration computational cost similar to that of Euclidean baselines while offering enhanced robustness under task heterogeneity and adversarial label noise. Experiments on synthetic and real datasets (e.g., wearable-sensor activity recognition) show improved estimation accuracy and reduced negative transfer.
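A minimal sketch of the two geometric ingredients named above, assuming the shared representation is a column-orthonormal d x r matrix (a point on the Stiefel manifold): project the Euclidean gradient onto the tangent space, take a step, and return to the manifold with a polar retraction. The loss gradient `grad_fn` and step size `eta` are placeholders; this is not the full GeoERM procedure.

```python
# Hedged sketch of a Riemannian gradient step with polar retraction (illustrative only).
import numpy as np

def tangent_project(A, G):
    """Project a Euclidean gradient G onto the tangent space at A (A.T @ A = I)."""
    sym = (A.T @ G + G.T @ A) / 2.0
    return G - A @ sym

def polar_retract(Y):
    """Polar retraction: the nearest column-orthonormal matrix to Y, via its SVD."""
    U, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ Vt

def riemannian_step(A, grad_fn, eta=0.1):
    """One geometry-aware update: Riemannian gradient step, then retract to the manifold."""
    xi = tangent_project(A, grad_fn(A))        # Riemannian gradient
    return polar_retract(A - eta * xi)         # stay on the Stiefel manifold
```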
Highlights
- Embeds the latent representation on its natural Riemannian manifold and operates via manifold-aware updates.
- Demonstrates resilience to heterogeneity and adversarial noise.
- Retains computational efficiency comparable to standard Euclidean MTL methods.
- Empirically outperforms leading MTL and single-task baselines.
Federated Transfer Learning with Differential Privacy
Mengchu Li, Ye Tian, Yang Feng, & Yi Yu, arXiv preprint, 2024
Paper
Summary
This study integrates federated transfer learning with differential privacy (DP) to enable knowledge transfer across domains without compromising user data confidentiality.
Highlights
- A noise-calibrated federated transfer algorithm with formal DP guarantees.
- Tight theoretical characterization of privacy-utility trade-offs.
- Demonstrates practical feasibility for privacy-sensitive health and financial data.
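For intuition, here is a minimal sketch of the generic privacy building block in this setting, assuming Gaussian-mechanism-style noise added to clipped client updates before they are shared; the paper's actual algorithms, noise calibration, and privacy accounting are more refined, and the names `privatize_update`, `clip`, and `sigma` are illustrative.

```python
# Hedged sketch of clipping plus Gaussian noise for a client update (illustrative only).
import numpy as np

def privatize_update(update, clip=1.0, sigma=1.0, rng=None):
    """Clip an update to L2 norm <= clip, then add N(0, (sigma * clip)^2) noise."""
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip / max(norm, 1e-12))
    return clipped + rng.normal(scale=sigma * clip, size=update.shape)
```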
If you would like to learn more about my work or explore code, slides, and related materials, please check my full list of publications on the Publications page or contact me.