Yunhao Gou (苟耘豪)

Ph.D. Candidate @ HKUST

About Me

I am currently a Ph.D. candidate in the CSE department of the Hong Kong University of Science and Technology (HKUST), jointly supervised by Prof. James T. Kwok and Prof. Yu Zhang. Previously, I was an undergraduate student majoring in Software Engineering at the University of Electronic Science and Technology of China (UESTC).

My current research interests include:

  • Harmfulness of MLLM/LLM: ECSO
  • Mixture of Experts (MoE): MoCLE
  • Vision and Language Representation Learning: EPIC, HGR-Net, RSAN

News
  • [2024.03] Code and checkpoints of MoCLE and ECSO have been released. Feel free to try them out!
  • [2024.03] Our work ECSO, the first work that makes MLLMs safe with neither training nor any external models, is on arXiv!
  • [2023.12] Our work MoCLE was covered by QbitAI.
  • [2023.12] Our work MoCLE, the first MLLM with MoE architecture for instruction customization and generalization, is on arXiv!
  • [2023.02] One paper accepted by CVPR 2023!
  • [2022.09] Joined HKUST CSE for Ph.D. study.
  • [2022.07] One paper accepted by ECCV 2022!
  • [2022.01] Joined ByteDance AI Lab as a research intern.
  • [2021.07] One paper accepted by CIKM 2021!
Selected Publications

Full publication list on Google Scholar. (* denotes equal contribution)

Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation

Yunhao Gou*, Kai Chen*, Zhili Liu*, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-Yan Yeung, James Kwok, Yu Zhang.

1) Makes MLLMs safe with neither training nor any external models!

2) Free data engine for MLLM alignment on its own!

arXiv preprint, 2024.

[PDF] [Project page]

Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning

Yunhao Gou*, Zhili Liu*, Kai Chen*, Lanqing Hong, Hang Xu, Aoxue Li, Dit-Yan Yeung, James Kwok, Yu Zhang.

First MLLM with MoE for instruction customization and generalization!

arXiv preprint, 2023.

[PDF] [Project page] [Wechat Post] [Talk]

Leveraging per Image-Token Consistency for Vision-Language Pre-training

Yunhao Gou, Tom Ko, Hansi Yang, James Kwok, Yu Zhang, Mingxuan Wang.

IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), 2023.

[PDF]

Exploring Hierarchical Graph Representation for Large-Scale Zero-Shot Image Classification

Kai Yi, Xiaoqian Shen, Yunhao Gou, Mohamed Elhoseiny.

European Conference on Computer Vision (ECCV), 2022.

[PDF] [Project page]

Region Semantically Aligned Network for Zero-Shot Learning

Ziyang Wang*, Yunhao Gou*, Jingjing Li, Yu Zhang, Yang Yang.

International Conference on Information & Knowledge Management (CIKM), 2021.

[PDF]
Talks
  • [TechBeat Online] Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning. [Recording]
Experiences
ByteDance AI Lab
Jan. 2022 - Mar. 2023
Research Intern, working with Tom Ko
National University of Singapore
July 2019 - Aug. 2019
International exchange student
Selected Awards

Research Travel Grant HKUST

2023

Postgraduate Scholarship HKUST

2022

Overseas Visiting Student Stipend of UESTC

2019

National Scholarship

2019