Transfer Learning and Multi-task Learning

Overview

This page summarizes several of my recent works on Transfer Learning (TL) and Multi-task Learning (MTL), focusing on theoretical guarantees, algorithmic design, and applications to high-dimensional, unsupervised, and federated settings.
These works aim to answer a common question:

How can we safely and efficiently borrow information across related tasks or domains, while avoiding negative transfer and preserving privacy?

🧮 Transfer Learning under High-Dimensional Generalized Linear Models

Ye Tian & Yang Feng, Journal of the American Statistical Association (JASA), 2023
📄 Paper | 💻 R package: glmtrans

Summary
This work develops a principled framework for transfer learning in high-dimensional GLMs, where multiple related source datasets can inform a target task. It proposes a two-step TransGLM procedure together with a transferable source detection algorithm that automatically identifies which sources are informative, mitigating negative transfer (a minimal sketch of the two-step idea follows the highlights below).

Highlights

  • Introduces a transferable source detection mechanism to adaptively select useful sources.
  • Provides non-asymptotic bounds for estimation and prediction errors.
  • Establishes theoretical conditions under which transfer learning yields provable benefits.
  • Includes a method for valid post-transfer inference (confidence intervals).
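
To make the two-step recipe concrete, here is a minimal, illustrative Python sketch for the Gaussian (linear) case. The margin-based source screen, the fixed lasso penalty, and the function name `transfer_lasso` are assumptions made for exposition; they are not the glmtrans implementation.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score

def transfer_lasso(X_tgt, y_tgt, sources, alpha=0.05, margin=1.05):
    """Two-step transfer for a linear model: pool informative sources, then debias."""
    lasso = lambda: Lasso(alpha=alpha, fit_intercept=False)
    # Source detection: keep a source only if pooling it does not inflate
    # target error beyond a small margin over the target-only CV baseline.
    base = -cross_val_score(lasso(), X_tgt, y_tgt,
                            scoring="neg_mean_squared_error", cv=5).mean()
    keep = []
    for Xs, ys in sources:
        fit = lasso().fit(np.vstack([X_tgt, Xs]), np.concatenate([y_tgt, ys]))
        if np.mean((y_tgt - fit.predict(X_tgt)) ** 2) <= margin * base:
            keep.append((Xs, ys))
    # Step 1: rough estimate from the pooled target + informative-source data.
    Xp = np.vstack([X_tgt] + [Xs for Xs, _ in keep])
    yp = np.concatenate([y_tgt] + [ys for _, ys in keep])
    w = lasso().fit(Xp, yp).coef_
    # Step 2: debias by refitting the residuals on the target data alone.
    delta = lasso().fit(X_tgt, y_tgt - X_tgt @ w).coef_
    return w + delta
```

The residual refit in the last step is what corrects the bias introduced by pooling: informative sources sharpen the rough estimate, and the target-only lasso on the residuals pulls it back toward the target parameter.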

🧩 Unsupervised Multi-task and Transfer Learning on Gaussian Mixture Models

Ye Tian, Haolei Weng, Lucy Xia, & Yang Feng, arXiv preprint, 2024
📄 Paper | 💻 R package: mtlgmm

Summary
This paper studies unsupervised MTL/TL for Gaussian Mixture Models (GMMs), proposing a robust and theoretically grounded approach that jointly learns across multiple mixture tasks.

Highlights

  • A multi-task EM algorithm that learns shared latent structures across tasks.
  • Robustness against outlier tasks with arbitrary distributions.
  • Finite-sample guarantees and minimax-optimal estimation rates.
  • Theoretical resolution of label-alignment issues across tasks.
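
As a rough illustration of the joint E/M structure, the sketch below runs per-task EM for two-component spherical GMMs, aligns component labels across tasks, and then shrinks each task's means toward the cross-task average. The plain-averaging shrinkage and unit covariances are simplifying assumptions, not the mtlgmm updates.

```python
import numpy as np

def mtl_em(tasks, lam=0.5, n_iter=50):
    """tasks: list of (n_k, d) data arrays; returns per-task (2, d) means."""
    d = tasks[0].shape[1]
    mus = [np.stack([np.ones(d), -np.ones(d)]) for _ in tasks]
    for _ in range(n_iter):
        for k, X in enumerate(tasks):
            # E-step: responsibilities under unit-variance Gaussians
            # (stabilized by subtracting the row-wise minimum distance).
            d2 = ((X[:, None, :] - mus[k][None]) ** 2).sum(-1)   # (n_k, 2)
            r = np.exp(-0.5 * (d2 - d2.min(1, keepdims=True)))
            r /= r.sum(1, keepdims=True)
            # M-step: responsibility-weighted means.
            mus[k] = (r.T @ X) / r.sum(0)[:, None]
        # Align component labels to task 0 before sharing (label switching).
        for k in range(1, len(tasks)):
            if np.linalg.norm(mus[k] - mus[0]) > np.linalg.norm(mus[k][::-1] - mus[0]):
                mus[k] = mus[k][::-1]
        # Cross-task step: shrink every task toward the shared center.
        center = np.mean(mus, axis=0)
        mus = [(1 - lam) * m + lam * center for m in mus]
    return mus
```

The explicit alignment pass is the sketch-level analogue of the label-alignment issue the paper resolves: without it, averaging mixture parameters across tasks can cancel components instead of sharing them.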

๐ŸŒ Towards the Theory of Unsupervised Federated Learning

Ye Tian, Haolei Weng, & Yang Feng, ICML 2024
📄 Paper | 💻 Python package: FedGrEM

Summary
This paper provides one of the first non-asymptotic analyses of federated EM algorithms for unsupervised learning. It bridges the gap between theory and practice in federated expectation-maximization across heterogeneous clients.

Highlights

  • Finite-sample convergence guarantees for federated EM under heterogeneity.
  • Characterizes trade-offs between communication cost and estimation accuracy.
  • Addresses label alignment and initialization issues unique to unsupervised federated setups.
  • Empirically validated across distributed mixture models.
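
The communication pattern under study can be sketched as follows: each client computes local responsibilities (E-step) on its own data and uploads only aggregate sufficient statistics, which the server combines into a global M-step. Plain unweighted aggregation over a shared two-component spherical GMM is an illustrative assumption here, not the FedGrEM update rule.

```python
import numpy as np

def federated_em(clients, mu_init, n_rounds=30):
    """clients: list of (n_i, d) arrays; mu_init: (2, d) initial means."""
    mu = mu_init.copy()
    for _ in range(n_rounds):                 # one communication round each
        num = np.zeros_like(mu)               # responsibility-weighted sums
        den = np.zeros(mu.shape[0])           # responsibility totals
        for X in clients:
            d2 = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
            r = np.exp(-0.5 * (d2 - d2.min(1, keepdims=True)))
            r /= r.sum(1, keepdims=True)      # local E-step
            num += r.T @ X                    # only these two statistics
            den += r.sum(0)                   # ever leave the client
        mu = num / den[:, None]               # server-side global M-step
    return mu
```

Each round costs one upload of a (2, d) matrix and a length-2 vector per client, which is the kind of communication/accuracy trade-off the analysis characterizes.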

🧠 Learning from Similar Linear Representations: Adaptivity, Minimaxity, and Robustness

Ye Tian, Yuqi Gu, & Yang Feng, Journal of Machine Learning Research (JMLR), 2025
📄 Paper | 💻 Python package: RL-MTL-TL

Summary
This work provides a general theory for learning from similar but not identical linear representations. It develops adaptive algorithms that automatically determine how much to share across tasks, achieving robustness and minimax optimality.

Highlights

  • Introduces an adaptive penalized ERM framework for shared representations.
  • Characterizes regimes of beneficial vs. negative transfer.
  • Achieves minimax rates and adapts seamlessly to unknown similarity levels.
  • Robust to outlier tasks and distributional shifts.
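
A minimal sketch of the penalized-ERM idea under simplifying assumptions: every task regresses through a shared representation `A` plus a penalized task-specific deviation `D[k]`. The alternating updates, the ridge-type penalty, and the fixed step size are illustrative choices, not the RL-MTL-TL algorithm.

```python
import numpy as np

def shared_rep_erm(tasks, r=2, lam=1.0, n_iter=200, lr=1e-3):
    """tasks: list of (X_k, y_k); returns shared A (p, r) and per-task coefficients."""
    p = tasks[0][0].shape[1]
    rng = np.random.default_rng(0)
    A = np.linalg.qr(rng.normal(size=(p, r)))[0]      # shared representation
    W = [np.zeros(r) for _ in tasks]                  # task weights on A
    D = [np.zeros(p) for _ in tasks]                  # penalized deviations
    for _ in range(n_iter):
        grad_A = np.zeros_like(A)
        for k, (X, y) in enumerate(tasks):
            # Task weights have a closed form given A and D[k].
            W[k] = np.linalg.lstsq(X @ A, y - X @ D[k], rcond=None)[0]
            resid = X @ (A @ W[k] + D[k]) - y
            # lam controls how far a task may drift from the shared part.
            D[k] -= lr * (X.T @ resid + lam * D[k])
            grad_A += np.outer(X.T @ resid, W[k])
        A -= lr * grad_A / len(tasks)                 # shared-representation step
    return A, [A @ w + d for w, d in zip(W, D)]
```

Large `lam` forces near-full sharing, while `lam → 0` recovers independent single-task fits; adapting this dial to the unknown similarity level, and capping the damage an outlier task can do, is what the paper's theory is about.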

🧭 Geometry-Aware Multi-Task Representation Learning

Aoran Chen & Yang Feng, arXiv preprint, 2025
📄 Paper | 💻 Python package: GeoERM

Summary
This paper introduces GeoERM, a geometry-aware multi-task learning (MTL) framework that respects the intrinsic Riemannian geometry of the representation space. Rather than treating the shared representation as a point in Euclidean space, the method performs optimization on the appropriate manifold, combining:

  1. Riemannian gradient steps that align with the curvature of the search space,
  2. Polar retraction to ensure that updates remain on the manifold.

GeoERM incurs per-iteration computational cost similar to that of Euclidean baselines while offering enhanced robustness under task heterogeneity and adversarial label noise. Experiments on synthetic and real datasets (e.g., wearable-sensor activity recognition) show improved estimation accuracy and reduced negative transfer.
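
Item 2 above is the key geometric ingredient. Here is a minimal sketch, assuming the shared representation lives on the Stiefel manifold of orthonormal p × r matrices; the step size and the surrounding loss are placeholders, not GeoERM's.

```python
import numpy as np

def riemannian_step(A, G, lr=0.1):
    """One Riemannian gradient step with polar retraction.

    A: (p, r) with A.T @ A = I;  G: (p, r) Euclidean gradient at A.
    """
    # Project the Euclidean gradient onto the tangent space at A.
    sym = (A.T @ G + G.T @ A) / 2
    rgrad = G - A @ sym
    # Step in the tangent direction, then retract to the manifold via the
    # polar decomposition: U @ Vt is the closest orthonormal matrix to X.
    X = A - lr * rgrad
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ Vt
```

The polar factor is computed from one thin SVD, which is why the manifold-aware update stays close in per-iteration cost to a plain Euclidean gradient step while keeping iterates exactly orthonormal.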

Highlights

  • Embeds the latent representation on its natural Riemannian manifold and operates via manifold-aware updates.
  • Demonstrates resilience to heterogeneity and adversarial noise.
  • Retains computational efficiency comparable to standard Euclidean MTL methods.
  • Empirically outperforms leading MTL and single-task baselines.

🔒 Federated Transfer Learning with Differential Privacy

Mengchu Li, Ye Tian, Yang Feng, & Yi Yu, arXiv preprint, 2024
📄 Paper

Summary
This study integrates federated transfer learning with differential privacy (DP) to enable knowledge transfer across domains without compromising user data confidentiality.

Highlights

  • A noise-calibrated federated transfer algorithm with formal DP guarantees.
  • Tight theoretical characterization of privacy–utility trade-offs.
  • Demonstrates practical feasibility for privacy-sensitive health and financial data.
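
The core privacy step can be sketched in a few lines: each client clips its local update and adds Gaussian noise calibrated to the clipping norm, i.e., the standard (ε, δ)-DP Gaussian mechanism. The composition accounting and the transfer-specific weighting of the paper are omitted; `privatize_update` and its defaults are illustrative assumptions.

```python
import numpy as np

def privatize_update(update, clip=1.0, epsilon=1.0, delta=1e-5, rng=None):
    """Return an (epsilon, delta)-DP version of a client's update vector."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip / max(norm, 1e-12))  # bound sensitivity
    # Gaussian-mechanism calibration (valid for epsilon <= 1):
    # sigma = clip * sqrt(2 ln(1.25 / delta)) / epsilon.
    sigma = clip * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
    return clipped + rng.normal(scale=sigma, size=np.shape(update))
```

Clipping fixes the L2 sensitivity of each contribution, so the added noise, and hence the utility loss that the paper quantifies, scales with clip/ε rather than with the raw magnitude of any single client's data.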
