Jianshu Zhang

CS Ph.D. student at Northwestern University.

About Me

Hi! I’m Jianshu Zhang (张鉴殊), a first-year CS Ph.D. student at Northwestern University, fortunate to be co-advised by Prof. Manling Li and Prof. Han Liu. I am passionate about building intelligent systems that understand and act in the physical world through multimodal reasoning. I love collaborating on ambitious ideas—feel free to reach out if you’d like to chat.

Research Interests

How can VLMs truly step into our world?

  • Where are the bottlenecks? VLMs inherit language-model data biases: text corpora are abundant, while high-quality multimodal corpora are scarce. To address this imbalance, I pursue data-centric approaches that automatically curate rich, high-quality multimodal data so models gain the grounding they lack.
  • How should VLMs learn? Vision signals are continuous and environments are dynamic, making alignment difficult. Rather than relying on static training signals, I focus on context-sensitive multimodal learning so VLMs can adapt their behavior to diverse settings through in-context cues.
  • How can VLMs interact with the physical world? Moving from perception to action requires more than recognition: VLMs need to understand spatial structure and leverage past experience. To this end, I study spatial intelligence and memory-driven action planning so VLM-based agents can translate self-exploration into meaningful actions while navigating real spaces.

News

  • 2025.09 Honored to receive the McCormick School of Engineering Fellowship from Northwestern University.
  • 2025.09 Joined Northwestern University as a Ph.D. student in Computer Science.
  • 2025.08 Evo-MARL and FairReason were accepted to T2FM@ICCV 2025.
  • 2025.08 WebCoT was accepted to EMNLP 2025 (Findings) and AIA@COLM 2025.
  • 2025.06 MultiVerse was accepted to ICCV 2025.
  • 2025.06 🎓 Graduated with a B.E. degree and was selected as an Outstanding Undergraduate.
  • 2025.05 VLM2-Bench was accepted to ACL 2025 (Main), and Bridge-Coder to ACL 2025 (Findings).
  • 2025.05 Honored with the Lei Jun Computer Breakthrough Award (50K RMB).
  • 2025.05 CAN was accepted to ICML 2025.
  • 2025.01 PVIT was accepted to ICLR 2025.
  • 2024.11 PVIT-3M dataset ranked Top 3 in downloads on Hugging Face.
  • 2024.10 Awarded the National Scholarship (top 0.2% nationally).
  • 2024.09 Image Textualization accepted to NeurIPS 2024 (D&B).
  • 2024.09 MLLM-Protector and FIRST accepted to EMNLP 2024 (Main).
  • 2024.03 CORE accepted to CogSci 2024 (Oral).
  • 2023.12 FuzzLLM accepted to ICASSP 2024.

Publications

VLM2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues

Jianshu Zhang*, Dongyu Yao*, Renjie Pi, Paul Pu Liang, Yiren Fung

ACL 2025 (Main)

Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents

Rui Wang, Ce Zhang, Jun-Yu Ma, Jianshu Zhang*, Hongru Wang, Yi Chen, Boyang Xue, Tianqing Fang, Zhisong Zhang, Hongming Zhang, Haitao Mi, Dong Yu, Kam-Fai Wong

arXiv preprint

Spatial Mental Modeling from Limited Views

Baiqiao Yin, Qineng Wang, Pingyue Zhang, Jianshu Zhang, Kangrui Wang, Zihan Wang, Jieyu Zhang, Keshigeyan Chandrasegaran, Han Liu, Ranjay Krishna, Saining Xie, Manling Li, Jiajun Wu, Li Fei-Fei

ICCV 2025 Workshop SP4V (Best Paper Award)

MultiVerse: A Multi-Turn Conversation Benchmark for Evaluating Large Vision and Language Models

Young-Jun Lee, Byung-Kwan Lee, Jianshu Zhang, Yechan Hwang, Byungsoo Ko, Han-Gyu Kim, Dongyu Yao, Xuankun Rong, Eojin Joo, Seung-Ho Han, Bowon Ko, Ho-Jin Choi

ICCV 2025; ICCV 2025 Workshop KnowledgeMR

Personalized Visual Instruction Tuning

Renjie Pi*, Jianshu Zhang*, Tianyang Han, Jipeng Zhang, Rui Pan, Tong Zhang

ICLR 2025

WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback

Minda Hu, Tianqing Fang, Jianshu Zhang, Junyu Ma, Zhisong Zhang, Jingyan Zhou, Hongming Zhang, Haitao Mi, Dong Yu, Irwin King

EMNLP 2025 (Findings); COLM 2025 Workshop AIA

CAN: Leveraging Clients as Navigators for Generative Replay in Federated Continual Learning

Xuankun Rong*, Jianshu Zhang*, Kun He, Mang Ye

ICML 2025

Bridge-Coder: Transferring Model Capabilities from High-Resource to Low-Resource Programming Language

Jipeng Zhang*, Jianshu Zhang*, Yuanzhe Li*, Renjie Pi, Rui Pan, Runtao Liu, Zheng Ziqiang, Tong Zhang

ACL 2025 (Findings)

Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions

Renjie Pi*, Jianshu Zhang*, Jipeng Zhang, Rui Pan, Zhekai Chen, Tong Zhang

NeurIPS 2024 (D&B)

MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance

Renjie Pi*, Tianyang Han*, Jianshu Zhang*, Yueqi Xie, Rui Pan, Qing Lian, Hanze Dong, Jipeng Zhang, Tong Zhang

EMNLP 2024 (Main)

FuzzLLM: A Novel and Universal Fuzzing Framework for Proactively Discovering Jailbreak Vulnerabilities in Large Language Models

Dongyu Yao*, Jianshu Zhang*, Ian G. Harris, Marcel Carlsson

ICASSP 2024; presented at ShmooCon 2024 (an American hacker convention)

FIRST: Teach A Reliable Large Language Model Through Efficient Trustworthy Distillation

KaShun Shum*, Minrui Xu*, Jianshu Zhang*, Zixin Chen, Shizhe Diao, Hanze Dong, Jipeng Zhang, Muhammad Omer Raza

EMNLP 2024 (Main)

CORE: Mitigating Catastrophic Forgetting in Continual Learning through Cognitive Replay

Jianshu Zhang*, Yankai Fu*, Ziheng Peng*, Dongyu Yao, Kun He

CogSci 2024 (Oral)

Awards

  • McCormick School of Engineering Fellowship (~46K USD)
  • National Scholarship
  • Lei Jun Computer Breakthrough Award (50K RMB)
  • Outstanding Undergraduate
  • First-Class Scholarship (ranked 1st)
  • Merit Student

Education

  • Northwestern University, Ph.D. in Computer Science (2025 – 2030, expected)
  • Wuhan University, B.E. (2021 – 2025)
  • Shenzhen Middle School (2018 – 2021)

Misc

I’m grateful for the mentorship of Prof. Tong Zhang (UIUC), Prof. Paul Liang (MIT), and Prof. Yiren Fung (HKUST). Outside of research, I enjoy basketball🏀, billiards🎱, table tennis🏓, swimming🏊, electric guitar🎸, and the occasional power nap😴 to keep me energized.