
708 Broadway, Room 415

Department of Biostatistics

New York University

New York, NY 10003

Email: yf31@nyu.edu



Google Scholar Profile

Professor · Biostatistics

Yang Feng

Statistical machine learning and high-dimensional inference for biomedical discovery.

New York University · School of Global Public Health · Affiliate, Center for Data Science

Overview

I am a Professor of Biostatistics in the School of Global Public Health at New York University and an affiliated faculty member at the Center for Data Science. My research focuses on the theoretical and methodological foundations of machine learning, high-dimensional statistics, network models, and nonparametric methods, with applications in Alzheimer’s disease prognosis, cancer subtype classification, genomics, electronic health records, and biomedical imaging.

I lead the Feng Lab, which is dedicated to advancing statistical learning, data science, and AI through rigorous research and impactful applications. For a short, conference-style biography, see the Bio page.

📢 Message to Prospective PhD Students

Thank you very much for your interest in pursuing a PhD and for considering working with me. Please note that PhD admissions at NYU Biostatistics are handled at the departmental level rather than by individual faculty. You are welcome to mention my name in your application if you believe our research interests align. Due to the large volume of inquiries I receive each year, I sincerely apologize that I am not able to respond to individual inquiry emails. I appreciate your understanding and wish you the very best in your application process.

Machine Learning

Transfer, multi-task and federated learning; Neyman–Pearson classification; causal inference; deep learning.

High-Dimensional Statistics

Variable selection and screening, Gaussian graphical models, inference under high dimensions.

Network Models

Community detection, network embedding, statistical inference on graphs.

Applications

Electronic health records, genomics, epidemiology, neuroscience, social networks, computer vision.

Semiparametric Modeling and Analysis for Longitudinal Network Data

A semiparametric framework for networks observed over time that lets connection patterns evolve smoothly rather than forcing every edge to follow a fixed parametric law.

AoS · 2025

Neyman-Pearson Multi-Class Classification via Cost-Sensitive Learning

Extends the Neyman-Pearson paradigm, which controls the specific error you care about most, from binary to multi-class classification by recasting it as a cost-sensitive learning problem with guarantees on the user-specified error budget.

JASA · 2024

Transfer Learning under High-Dimensional Generalized Linear Models

Borrows strength from related high-dimensional generalized linear models to estimate a target model at provably minimax-optimal rates, together with a data-driven test that filters out source datasets that would degrade the target estimate.

JASA · 2023

PCABM: Pairwise Covariates-Adjusted Block Model for Community Detection

A community-detection model that adjusts for node-pair covariates, blending block structure with a regression-style adjustment for the part of connectivity the covariates explain.

JASA · 2023

Testing Community Structure for Hypergraphs

A likelihood-ratio test for whether a hypergraph contains more than one community, with sharp boundaries on when communities are statistically detectable that improve over methods designed only for ordinary graphs.

AoS · 2022

RaSE: Random Subspace Ensemble Classification

A flexible high-dimensional classification framework that aggregates many randomly sampled low-dimensional subspace classifiers, with consistency theory and a CRAN package.

JMLR · 2021

Neyman-Pearson Classification Algorithms and NP Receiver Operating Characteristics

A practical algorithm that controls the Type-I error of any classifier with prescribed high probability, plus an NP-ROC curve for comparing classifiers under asymmetric error costs.

Sci. Adv. · 2018
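
One way a high-probability Type-I error guarantee of this kind can be obtained is an order-statistic rule: sort the scores of held-out class-0 observations and pick the smallest rank whose binomial tail bound on the violation probability stays below a tolerance. The sketch below is a minimal illustration under that assumption, not the published implementation. The function names `np_threshold_rank` and `np_threshold` and the parameters `alpha` (target Type-I error) and `delta` (violation tolerance) are hypothetical.

```python
import math


def np_threshold_rank(n, alpha, delta):
    """Return the smallest 1-indexed rank k such that thresholding at the
    k-th smallest of n held-out class-0 scores keeps
    P(Type-I error > alpha) <= delta, or None if no rank qualifies.

    Assumes i.i.d. class-0 scores with a continuous distribution, so the
    violation probability is bounded by the binomial tail
    sum_{j=k}^{n} C(n, j) * (1 - alpha)^j * alpha^(n - j).
    """
    tail = 0.0
    best = None
    # Walk down from k = n; the tail sum only grows as k decreases.
    for k in range(n, 0, -1):
        tail += math.comb(n, k) * (1 - alpha) ** k * alpha ** (n - k)
        if tail <= delta:
            best = k
        else:
            break
    return best


def np_threshold(class0_scores, alpha, delta):
    """Pick a score threshold; predict class 1 when score > threshold."""
    k = np_threshold_rank(len(class0_scores), alpha, delta)
    if k is None:
        raise ValueError("too few class-0 scores for this (alpha, delta)")
    return sorted(class0_scores)[k - 1]
```

Under these assumptions, with `alpha = delta = 0.05` at least 59 held-out class-0 scores are needed before any rank satisfies the bound, because the smallest possible tail, `(1 - alpha)^n`, first drops below 0.05 at `n = 59`.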

Model Selection for High-Dimensional Quadratic Regression via Regularization

A regularized model-selection method for high-dimensional regression with interaction terms that respects the strong-heredity principle: an interaction is chosen only when both of its main effects are also present. A weak-heredity version, which requires only one of the main effects to be present, is also proposed.

JASA · 2018
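
To make the heredity constraints concrete, the sketch below lists which pairwise interactions are eligible given a set of selected main effects. It illustrates the constraint only, not the regularized estimator from the paper. The function name `allowed_interactions` and the feature labels are hypothetical.

```python
from itertools import combinations


def allowed_interactions(main_effects, all_features, heredity="strong"):
    """List the feature pairs whose interaction term may enter the model.

    strong: both main effects must already be selected.
    weak:   at least one of the two main effects must be selected.
    """
    main = set(main_effects)
    pairs = combinations(sorted(all_features), 2)
    if heredity == "strong":
        return [(j, k) for j, k in pairs if j in main and k in main]
    if heredity == "weak":
        return [(j, k) for j, k in pairs if j in main or k in main]
    raise ValueError("heredity must be 'strong' or 'weak'")


features = ["x1", "x2", "x3", "x4"]
selected = ["x1", "x3"]
strong_pairs = allowed_interactions(selected, features, "strong")
weak_pairs = allowed_interactions(selected, features, "weak")
```

With `x1` and `x3` selected, strong heredity admits only the `x1`-`x3` interaction, while weak heredity admits every pair that involves `x1` or `x3`.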

Nonparametric Independence Screening in Sparse Ultra-High-Dimensional Additive Models

Extends sure-independence screening to additive nonparametric models, drastically reducing ultra-high-dimensional feature pools while retaining all the truly relevant variables with high probability.

JASA · 2011

Network Exploration via the Adaptive LASSO and SCAD Penalties

Estimates the structure of a sparse Gaussian graphical model (the network of conditional dependencies among many variables) using adaptive LASSO and SCAD penalties, with oracle-property guarantees on variable selection.

AoAS · 2009

Join the Feng Lab

The lab brings together PhD students, postdocs, and collaborators working on the statistical foundations of machine learning and their biomedical applications. Recent alumni have placed at top universities and industry research labs.

Lab page & openings →

Academic Appointments
Education
  • PhD in Operations Research, Princeton University (2006–2010)
  • BS in Mathematics, University of Science and Technology of China (2002–2006)
Editorial Activities
Selected Honors and Awards
  • Faculty of the Year Award, NYU School of Global Public Health, 2025
  • Teaching Excellence Award, NYU School of Global Public Health, 2024
  • Fellow, Institute of Mathematical Statistics (IMS), 2023
  • Fellow, American Statistical Association (ASA), 2022
  • Elected Member, International Statistical Institute (ISI), 2017
  • NSF CAREER Award, National Science Foundation (NSF), 2016
Research Support & Grants
  • NSF Grant DMS-2324489: Collaborative Research: New Theory and Methods for High-Dimensional Multi-Task and Transfer Learning Inference