Biography

I am a tenure-track assistant professor in Computer Science and UMIACS at the University of Maryland, College Park. My research interests are in machine learning, optimization, and natural language processing. I am part of the Center for Machine Learning (CML) and the CLIP Lab at UMIACS. I have published ~100 papers at ML (NeurIPS, ICML, ICLR), NLP (ACL, EMNLP, NAACL), CV (CVPR, ICCV, ECCV), DM (KDD, ICDM), and AI (AAAI, IJCAI) conferences, and in journals such as Machine Learning (Springer) and IEEE TPAMI/TIP/TNNLS/TKDE.

Our recent work studies: (1) how, why, and when to transfer human learning strategies (e.g., curriculum, retention, sub-tasking, curiosity, exemplar selection, collaboration) to improve machine learning and generalization in the wild (e.g., with unlabeled, biased, noisy, redundant, or distributed data, and in unseen tasks/environments); (2) controllable generative AI in both training and inference/adaptation; (3) synthetic data, self-evolving AI, and auto-benchmarking; and (4) human-AI teaming and hybrid agents with personalization. We develop these methods with LLMs, multi-modality foundation models, and RL. Our goal is efficient, versatile, trustworthy, and environmentally friendly hybrid intelligence based on coevolution between humans and machines. Code, data, and models can be found at Tianyi Lab’s GitHub and HF.

I was a visiting research scientist at Google from 2021 to 2022. Before that, I received my Ph.D. (thesis) in Computer Science from the University of Washington, where I was a member of the MELODI lab led by Prof. Jeff A. Bilmes. Earlier, I worked with Prof. Dacheng Tao as a research assistant at the University of Technology, Sydney (UTS) and Nanyang Technological University. I was a research intern at Yahoo! Labs, mentored by Dr. Hua Ouyang (now at Apple) and Prof. Yi Chang (now at Jilin University), and a research intern at Microsoft Research, mentored by Dr. Lin Xiao (now at Meta AI).

News

Research Topics

  • Machine Learning (2008-present)
    1. Learning over time: Curriculum Learning, Continual Learning
    2. Learning via interactions: Reinforcement Learning, Online Learning
    3. Learning across tasks/domains: Multi-task Learning, Meta-Learning, Domain Adaptation/Generalization
    4. Learning multiple models: Mixture-of-Experts (MoE), Collaborative/Cooperative Learning, Federated/Decentralized Learning
    5. Learning under noise: Noisy-Label Learning, Adversarial Learning
    6. Learning representations: Self-Supervised Learning, Dimension Reduction
    7. Sparse Learning: Compressed Sensing, Matrix Factorization, Spectral Methods
    8. Optimization: Continuous, Combinatorial, Multi-Objective, Zeroth-order
    9. Controllable Generative AI
  • Natural Language Processing (2016-present)
    1. Attention mechanisms: DiSAN, BiBloSA
    2. Data Engineering (selection, exploration, synthesis) for Large Language Model (LLM) training: Reflection-Tuning, SuperFiltering, Alpagasus, Cherry LLM, Mosaic-IT
    3. LLM Agents, NeuroSymbolic World Models: Wall-E, DynaSaur
    4. Personalization and Human-AI Alignment: DEBATunE, MCTune, CAIMIRA
    5. Prompt Optimization: InstructZero, MoP
    6. In-Context Learning: BenTo, Div-S3
    7. Embedding: MoEE, MetaEOL
    8. Adversarial Attacks and Defenses (Jailbreak, Unlearning, etc.): DrAttack
  • Multi-modality Models (2021-present)
    1. Vision-Language Models and Dense Alignment across modalities: Florence-VL
    2. VLM + RL, Multi-modality Embodied AI: EMMA, CoTASP
    3. Multi-modal Generative Agents: MuLan
    4. Hallucinations, Illusions, Oversensitivity: HallusionBench, AutoHallusion, MOSSBench