Transfer Learning and Multi-task Learning

Statistical learning theory and methods for transfer and multi-task learning, including high-dimensional, unsupervised, federated, and privacy-preserving settings.

Overview

This page summarizes several of my recent works on Transfer Learning (TL) and Multi-task Learning (MTL), focusing on theoretical guarantees, algorithmic design, and applications to high-dimensional, unsupervised, and federated settings.
These works aim to answer a common question:

How can we safely and efficiently borrow information across related tasks or domains, while avoiding negative transfer and preserving privacy?

🧮 Transfer Learning under High-Dimensional Generalized Linear Models

Ye Tian & Yang Feng, Journal of the American Statistical Association (JASA), 2023
📄 Paper | 💻 R package: glmtrans

Summary
This work develops a principled framework for transfer learning in high-dimensional GLMs, where multiple related source datasets can inform a target task. The proposed TransGLM procedure pairs a two-step transfer estimator, a pooled fit across the target and informative sources followed by a target-specific correction, with an automatic transferable source detection algorithm that screens out uninformative sources and mitigates negative transfer.

Highlights

  • Introduces a transferable source detection mechanism to adaptively select useful sources.
  • Provides non-asymptotic bounds for estimation and prediction errors.
  • Establishes theoretical conditions under which transfer learning yields provable benefits.
  • Includes a method for valid post-transfer inference (confidence intervals).
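
To make the two-step idea concrete, here is a minimal, self-contained sketch of transferable source detection followed by a pooled fit plus target-only correction. It is only illustrative: ridge regression stands in for the paper's lasso-penalized GLM fits, and the function names and the detection threshold `tol` are assumptions, not the calibrated criterion from the paper or the glmtrans API.

```python
# Hedged sketch: ridge replaces the penalized GLM fits; `tol` is a made-up threshold.
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge estimate (stand-in for a penalized GLM fit)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def target_loss(beta, X, y):
    return np.mean((y - X @ beta) ** 2)

def detect_sources(Xt, yt, sources, tol=0.05):
    """Keep sources whose pooled fit does not degrade target prediction much."""
    base = target_loss(ridge_fit(Xt, yt), Xt, yt)
    keep = []
    for k, (Xs, ys) in enumerate(sources):
        Xp, yp = np.vstack([Xt, Xs]), np.concatenate([yt, ys])
        if target_loss(ridge_fit(Xp, yp), Xt, yt) <= base + tol:
            keep.append(k)
    return keep

def transfer_fit(Xt, yt, sources):
    """Two-step transfer: pooled estimate on detected sources, then a target-only correction."""
    keep = detect_sources(Xt, yt, sources)
    pooled = [(Xt, yt)] + [sources[k] for k in keep]
    Xp = np.vstack([X for X, _ in pooled])
    yp = np.concatenate([y for _, y in pooled])
    w = ridge_fit(Xp, yp)               # step 1: rough estimate from pooled data
    delta = ridge_fit(Xt, yt - Xt @ w)  # step 2: correct the remaining bias on the target
    return w + delta

# Toy usage: one informative source, one adversarial source that should be screened out.
rng = np.random.default_rng(0)
beta = np.array([1.0, -2.0, 0.5])
Xt = rng.normal(size=(50, 3));  yt = Xt @ beta + 0.1 * rng.normal(size=50)
X1 = rng.normal(size=(200, 3)); y1 = X1 @ (beta + 0.05) + 0.1 * rng.normal(size=200)
X2 = rng.normal(size=(200, 3)); y2 = X2 @ (-beta) + 0.1 * rng.normal(size=200)
print(transfer_fit(Xt, yt, [(X1, y1), (X2, y2)]))
```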

🧩 Unsupervised Multi-task and Transfer Learning on Gaussian Mixture Models

Ye Tian, Haolei Weng, Lucy Xia, & Yang Feng, arXiv preprint, 2024
📄 Paper | 💻 R package: mtlgmm

Summary
This paper studies unsupervised MTL/TL for Gaussian Mixture Models (GMMs), proposing a robust and theoretically grounded approach that jointly learns across multiple mixture tasks.

Highlights

  • A multi-task EM algorithm that learns shared latent structures across tasks.
  • Robustness against outlier tasks with arbitrary distributions.
  • Finite-sample guarantees and minimax-optimal estimation rates.
  • Theoretical resolution of label-alignment issues across tasks.
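
The following toy sketch conveys the flavor of multi-task EM with label alignment, using symmetric two-component mixtures N(+mu, I) and N(-mu, I). It is not the paper's estimator: the sign-flip alignment rule and the shrinkage weight `eta` are illustrative stand-ins for the paper's alignment and penalization schemes.

```python
# Hedged sketch: per-task EM plus alignment and shrinkage toward a cross-task center.
import numpy as np

def em_step(X, mu):
    """One EM update for a 0.5/0.5 mixture of N(+mu, I) and N(-mu, I)."""
    gamma = 1.0 / (1.0 + np.exp(-2.0 * X @ mu))   # responsibility of the +mu component
    return ((2.0 * gamma - 1.0) @ X) / X.shape[0]

def mtl_gmm(tasks, n_iter=50, eta=0.5, seed=0):
    rng = np.random.default_rng(seed)
    mus = [rng.normal(size=X.shape[1]) for X in tasks]
    for _ in range(n_iter):
        mus = [em_step(X, mu) for X, mu in zip(tasks, mus)]
        # Label alignment: each task's components are identified only up to swapping,
        # so flip each direction to agree with the first task before pooling.
        signs = [np.sign(mu @ mus[0]) or 1.0 for mu in mus]
        center = np.mean([s * mu for s, mu in zip(signs, mus)], axis=0)
        mus = [(1 - eta) * mu + eta * s * center for s, mu in zip(signs, mus)]
    return mus

# Toy usage: three related tasks with nearly shared, sign-ambiguous mixture centers.
rng = np.random.default_rng(1)
true_mu = np.array([2.0, -1.0])
tasks = []
for _ in range(3):
    mu_k = true_mu + 0.1 * rng.normal(size=2)      # related but not identical tasks
    tasks.append(np.vstack([rng.normal(loc=s * mu_k, size=(60, 2)) for s in (+1, -1)]))
print(mtl_gmm(tasks)[0])
```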

๐ŸŒ Towards the Theory of Unsupervised Federated Learning

Ye Tian, Haolei Weng, & Yang Feng, ICML 2024
📄 Paper | 💻 Python package: FedGrEM

Summary
This paper provides one of the first non-asymptotic analyses of federated EM algorithms for unsupervised learning. It bridges the gap between theory and practice in federated expectation-maximization across heterogeneous clients.

Highlights

  • Finite-sample convergence guarantees for federated EM under heterogeneity.
  • Addresses label alignment and initialization issues unique to unsupervised federated setups.
  • Empirically validated across distributed mixture models.
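
Below is a hedged sketch of one style of federated gradient-EM round for the same symmetric two-component mixture as above: each client takes a local gradient-EM step on its private data, the server averages, and clients keep a personalized blend of local and global estimates. The step size, blending weight, and round count are illustrative assumptions, not the tuned choices analyzed in the paper.

```python
# Hedged sketch of a federated gradient-EM round for N(+mu, I)/N(-mu, I) mixtures.
import numpy as np

def local_grad_em_step(X, mu, lr=0.5):
    """One gradient-EM update (not a full M-step) on a client's private data."""
    gamma = 1.0 / (1.0 + np.exp(-2.0 * X @ mu))
    grad = ((2.0 * gamma - 1.0) @ X) / X.shape[0] - mu
    return mu + lr * grad

def federated_em(clients, rounds=100, lr=0.5, blend=0.5, seed=0):
    rng = np.random.default_rng(seed)
    p = clients[0].shape[1]
    global_mu = rng.normal(size=p)
    local_mus = [global_mu.copy() for _ in clients]
    for _ in range(rounds):
        # Each client updates its personalized estimate; only the estimates are shared.
        local_mus = [local_grad_em_step(X, mu, lr) for X, mu in zip(clients, local_mus)]
        global_mu = np.mean(local_mus, axis=0)          # server aggregation
        # Personalization under heterogeneity: blend local and global estimates.
        local_mus = [(1 - blend) * mu + blend * global_mu for mu in local_mus]
    return global_mu, local_mus

# Toy usage: four clients whose mixture centers differ slightly.
rng = np.random.default_rng(2)
base = np.array([1.5, 1.0])
clients = []
for _ in range(4):
    mu_k = base + 0.1 * rng.normal(size=2)             # client-specific centers
    clients.append(np.vstack([rng.normal(loc=s * mu_k, size=(80, 2)) for s in (+1, -1)]))
print(federated_em(clients)[0])
```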

🧠 Learning from Similar Linear Representations: Adaptivity, Minimaxity, and Robustness

Ye Tian, Yuqi Gu, & Yang Feng, Journal of Machine Learning Research (JMLR), 2025
📄 Paper | 💻 Python package: RL-MTL-TL

Summary
This work provides a general theory for learning from similar but not identical linear representations. It develops adaptive algorithms that automatically determine how much to share across tasks, achieving robustness and minimax optimality.

Highlights

  • Introduces an adaptive penalized ERM framework for shared representations.
  • Characterizes regimes of beneficial transfer versus negative transfer.
  • Achieves minimax rates and adapts seamlessly to unknown similarity levels.
  • Robust to outlier tasks and distributional shifts.
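
The sketch below shows a simplified penalized-ERM loop with a shared low-rank linear representation: it alternates between per-task regression in the current shared subspace and re-estimating the subspace from the stacked coefficient estimates, with a ridge-penalized per-task correction so dissimilar tasks are not forced onto the shared subspace. The rank `r` and penalty `lam` are fixed assumptions here; the paper's algorithms adapt to unknown similarity levels.

```python
# Hedged sketch: alternating estimation of a shared representation plus per-task corrections.
import numpy as np

def shared_rep_mtl(tasks, r=2, lam=10.0, n_iter=50, seed=0):
    p = tasks[0][0].shape[1]
    A = np.linalg.qr(np.random.default_rng(seed).normal(size=(p, r)))[0]  # p x r orthonormal basis
    thetas = [np.zeros(p) for _ in tasks]
    for _ in range(n_iter):
        for t, (X, y) in enumerate(tasks):
            Z = X @ A
            w = np.linalg.lstsq(Z, y, rcond=None)[0]                      # coefficients in the subspace
            resid = y - Z @ w
            delta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ resid)  # penalized per-task correction
            thetas[t] = A @ w + delta
        # Re-fit the shared representation as the top-r subspace of the stacked estimates.
        U, _, _ = np.linalg.svd(np.column_stack(thetas), full_matrices=False)
        A = U[:, :r]
    return A, thetas

# Toy usage: three tasks whose coefficient vectors lie near a 2-dimensional subspace.
rng = np.random.default_rng(3)
B = np.linalg.qr(rng.normal(size=(6, 2)))[0]
tasks = []
for _ in range(3):
    beta = B @ rng.normal(size=2) + 0.05 * rng.normal(size=6)
    X = rng.normal(size=(80, 6))
    tasks.append((X, X @ beta + 0.1 * rng.normal(size=80)))
A_hat, theta_hat = shared_rep_mtl(tasks)
print(theta_hat[0])
```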

🧭 Geometry-Aware Multi-Task Representation Learning

Aoran Chen & Yang Feng, arXiv preprint, 2025
📄 Paper | 💻 Python package: GeoERM

Summary
This paper introduces GeoERM, a geometry-aware multi-task learning (MTL) framework that respects the intrinsic Riemannian geometry of the representation space. Rather than treating the shared representation as a point in Euclidean space, the method performs optimization on the appropriate manifold, combining:

  1. Riemannian gradient steps that align with the curvature of the search space,
  2. Polar retraction to ensure that updates remain on the manifold.

GeoERM incurs a per-iteration computational cost comparable to Euclidean baselines while offering enhanced robustness under task heterogeneity and adversarial label noise. Experiments on synthetic and real datasets (e.g., wearable-sensor activity recognition) show improved estimation accuracy and reduced negative transfer.

Highlights

  • Embeds the latent representation on its natural Riemannian manifold and operates via manifold-aware updates.
  • Demonstrates resilience to heterogeneity and adversarial noise.
  • Retains computational efficiency comparable to standard Euclidean MTL methods.
  • Empirically outperforms leading MTL and single-task baselines.
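
Here is a minimal sketch of the manifold machinery described above: a Riemannian gradient step on the Stiefel manifold (orthonormal representation matrices) followed by a polar retraction. The squared-loss objective, step size, and toy data are illustrative assumptions rather than the exact GeoERM objective; the point is the tangent-space projection and retraction that keep the representation on the manifold.

```python
# Hedged sketch of a geometry-aware update: project the gradient onto the tangent
# space of the Stiefel manifold at A, step, then retract back onto the manifold.
import numpy as np

def riemannian_grad(A, G):
    """Project the Euclidean gradient G onto the tangent space at A."""
    sym = (A.T @ G + G.T @ A) / 2.0
    return G - A @ sym

def polar_retraction(A, xi):
    """Map A + xi back onto the manifold via its polar factor (orthonormal columns)."""
    U, _, Vt = np.linalg.svd(A + xi, full_matrices=False)
    return U @ Vt

def geo_step(tasks, A, lr=2e-4):
    """One geometry-aware update of the shared representation A (p x r)."""
    G = np.zeros_like(A)
    for X, y in tasks:
        w = np.linalg.lstsq(X @ A, y, rcond=None)[0]      # per-task coefficients given A
        G += -2.0 * np.outer(X.T @ (y - X @ A @ w), w)    # Euclidean gradient contribution
    xi = riemannian_grad(A, -G)                           # descent direction along the manifold
    return polar_retraction(A, lr * xi)

# Toy usage: three regression tasks whose coefficients share a 2-dimensional subspace B.
rng = np.random.default_rng(4)
p, r = 6, 2
B = np.linalg.qr(rng.normal(size=(p, r)))[0]
tasks = []
for _ in range(3):
    X = rng.normal(size=(80, p))
    tasks.append((X, X @ (B @ rng.normal(size=r)) + 0.1 * rng.normal(size=80)))
A = np.linalg.qr(rng.normal(size=(p, r)))[0]
for _ in range(500):
    A = geo_step(tasks, A)
print(np.linalg.norm(A @ A.T - B @ B.T))  # projector distance to the true subspace (smaller is better)
```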

🔒 Federated Transfer Learning with Differential Privacy

Mengchu Li, Ye Tian, Yang Feng, & Yi Yu, arXiv preprint, 2024
📄 Paper

Summary
This study integrates federated transfer learning with differential privacy (DP) to enable knowledge transfer across domains without compromising user data confidentiality.

Highlights

  • A noise-calibrated federated transfer algorithm with formal DP guarantees.
  • Tight theoretical characterization of privacy–utility trade-offs.
  • Demonstrates practical feasibility for privacy-sensitive health and financial data.
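
To illustrate the privacy mechanism in spirit, here is a sketch of one differentially private federated round: each client clips per-example gradients, adds Gaussian noise calibrated by the standard Gaussian-mechanism formula, and the server averages the privatized updates. The calibration is per round and ignores composition across rounds, and the clipping norm, epsilon, and delta are illustrative assumptions, not the trade-offs derived in the paper.

```python
# Hedged sketch: per-example clipping + Gaussian noise, then server-side averaging.
import numpy as np

def gaussian_sigma(sensitivity, epsilon, delta):
    """Classical Gaussian mechanism: sigma = sensitivity * sqrt(2 log(1.25/delta)) / epsilon."""
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon

def private_local_update(X, y, theta, clip=1.0, epsilon=1.0, delta=1e-5, rng=None):
    """Per-example clipped, noised average gradient of the squared loss on one client."""
    rng = rng or np.random.default_rng()
    grads = X * (X @ theta - y)[:, None]                       # per-example gradients
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    grads = grads / np.maximum(1.0, norms / clip)              # clip each example's gradient
    sigma = gaussian_sigma(2.0 * clip / X.shape[0], epsilon, delta)  # replace-one sensitivity of the mean
    return grads.mean(axis=0) + rng.normal(scale=sigma, size=X.shape[1])

def dp_federated_round(clients, theta, lr=0.5, **dp_kwargs):
    """Server averages the privatized client updates and takes one gradient step."""
    updates = [private_local_update(X, y, theta, **dp_kwargs) for X, y in clients]
    return theta - lr * np.mean(updates, axis=0)

# Toy usage: five clients with similar linear models.
rng = np.random.default_rng(5)
beta = np.array([1.0, -1.0, 0.5])
clients = []
for _ in range(5):
    X = rng.normal(size=(500, 3))
    clients.append((X, X @ beta + 0.1 * rng.normal(size=500)))
theta = np.zeros(3)
for _ in range(100):
    theta = dp_federated_round(clients, theta)
print(theta)  # should land near beta, up to the privacy noise
```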

If you would like to learn more about my work or explore code, slides, and related materials, please check my full list of publications on the Publications page or contact me.