I am currently a PhD student at UCLA, advised by Prof. Cho-Jui Hsieh, and a project lead at TurningPointAI, advised by Ruochen Wang, Prof. Minhao Cheng, and Prof. Tianyi Zhou. TurningPointAI is a collaborative initiative dedicated to advancing the field of Multimodal Language Agents. Learn more about our work at TurningPointAI and stay updated by following us on Twitter. Previously, I was fortunate to work with Prof. Yue Gao on 3D Computer Vision at Tsinghua University.
My research is funded by the Amazon Trainium Fellowship. For further information, please see my CV (last updated: Nov 23, 2024).
Research Interests: My research centers on advancing MLLM post-training, with a focus on reasoning, agents, and multimodality.
I developed VisualThinker, one of the first open-source repositories to replicate the "aha moment" of DeepSeek-R1 on a small non-SFT multimodal model (600+ GitHub stars).
Prior to the LLM era, I worked on 3D Computer Vision, Human-Computer Interaction (HCI), and visually-rich document understanding.
We released the first successful replication of DeepSeek-R1's 'aha moment' in a multimodal task using only a 2B non-SFT model! - February 26, 2025
Our submission on multimodal oversensitivity was accepted to ICLR! - January 22, 2025
We presented the first test suite assessing whether current MLLMs overreact to benign queries! - June 22, 2024
R1-Zero’s “Aha Moment” in Visual Reasoning on a 2B Non-SFT Model