Sizhe Chen (陈思哲)
Biography
Hi! I am a CS Ph.D. candidate at UC Berkeley in Berkeley AI Research (BAIR), where I am fortunate to be advised by Prof. David Wagner; my thesis committee also includes Raluca Ada Popa, Sewon Min, and Eric Wallace. Supported by the Google- and Meta-BAIR Commons, I am working with Chawin Sitawarin and Arman Zharmagambetov, and have worked with Chuan Guo and Nicholas Carlini. I received my M.Eng. (National Scholarship) and B.Eng. (Summa Cum Laude) from Shanghai Jiao Tong University, working with Prof. Xiaolin Huang.
I study AI security in real-world applications. Currently, I am defending against prompt injection attacks, the number-one threat to AI agents. Prompt injection has caused actual harm in multiple AI systems from Google, OpenAI, Anthropic, Slack, and others. To open up broader usage of LLMs in agents, I develop principled, general, and practical prompt injection defenses. Our state-of-the-art open robust LLMs, Meta-SecAlign (ready for commercial usage), achieve attack success rates an order of magnitude lower against various prompt injections, and have been downloaded 10K times in 3 months.
I am fortunate to have mentored or worked with many talented students: Yizhu Wang, Jing Qian, Shutong Wu, Zhixing Ye, etc. With David Wagner, I am looking for a Summer 2026 intern who is applying to Fall 2027 Ph.D. programs.
Invited Talks
- Securing LLMs Against Prompt Injection for Agentic Applications
Penn State University: Guest Lecture at Threats and Cybersecurity 2025
Cornell University (Cornell-Tech Campus): Guest Lecture at Trustworthy AI 2025
Google DeepMind: Adversarial Machine Learning Seminar 2025
Duke University: Guest Lecture at Generative AI: Foundations, Applications, and Safety 2025
UC Berkeley: Security Seminar 2024
Hong Kong Baptist University: TMLR Young Scientist Seminar 2024
Shanghai Jiao Tong University: PAMI Group Seminar 2024
- On the Learning Preference of Deep Neural Networks
ICLR Oral Track 2023
AI Time Youth Ph.D. Talk 2023
- Subspace Adversarial Training
CVPR Oral Track 2022
- Adversarial Attacks and Defenses
Northeastern University: Security Seminar 2022
Selected Publications
- Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks
Sizhe Chen*, Arman Zharmagambetov, David Wagner, Chuan Guo*
Meta-SecAlign-70B is the first open-source LLM with built-in prompt injection defense and commercial-grade performance. It greatly outperforms gpt-4o and gemini-2.5-flash, and is comparable to gpt-5, in agentic (tool/web) utility and security.
- SecAlign: Defending Against Prompt Injection with Preference Optimization
Sizhe Chen, Arman Zharmagambetov, Saeed Mahloujifar, Kamalika Chaudhuri, David Wagner, Chuan Guo
SecAlign aims at a prompt-injection-robust LLM that prefers (and thus outputs) the secure response over the insecure one.
- StruQ: Defending Against Prompt Injection with Structured Queries
Sizhe Chen, Julien Piet, Chawin Sitawarin, David Wagner
StruQ is a general framework for prompt injection defense by separating the prompt (user instruction) and data into two channels.
- Defending Against Prompt Injection with DataFilter
Yizhu Wang, Sizhe Chen, Raghad Alkhudair, Basel Alomair, David Wagner
- Defending Against Prompt Injection with a Few DefensiveTokens
Sizhe Chen, Yizhu Wang, Nicholas Carlini, Chawin Sitawarin, David Wagner
- One-Pixel Shortcut: On the Learning Preference of Deep Neural Networks
Shutong Wu*, Sizhe Chen*, Cihang Xie, Xiaolin Huang
- Universal Adversarial Attack on Attention and the Resulting Dataset DAmageNet
Sizhe Chen, Zhengbao He, Chengjin Sun, Jie Yang, Xiaolin Huang
- Subspace Adversarial Training
Tao Li, Yingwen Wu, Sizhe Chen, Kun Fang, Xiaolin Huang
Services
- Reviewer: CCS 2024/2025/2026, SaTML 2025/2026, NeurIPS 2023/2025, ICML 2024/2025, ICLR 2023/2024/2025/2026, CVPR 2023/2024/2025, ICCV 2023, ECCV 2022/2024, IEEE TPAMI, Machine Learning, Pattern Recognition
- UC Berkeley EECS Student Reviewer: Faculty Hiring Committee 2024, Ph.D. Admission Committee 2024, Equal Access to Application Assistance 2024
Awards
- Research Funding: Meta-BAIR Commons 2024-2026, Google-BAIR Commons 2024-2026, UC Berkeley EECS Departmental Fellowship 2023, NeurIPS 2022 and ICLR 2023 Travel Support
- Degree Awards: SJTU Best Bachelor’s Thesis (1%) 2020, SJTU Outstanding Graduate 2022/2023
- Scholarship: China National Scholarship (0.2%) 2021/2022, Kwang-Hua Scholarship 2019, Arawana Scholarship 2017
Misc
- I practice neatness and minimalism.
- I love to sing, attend concerts, take photographs, hike, ski, and play badminton and table tennis.
- I write blog posts (only in Chinese so far) about my thoughts and experiences.
- My Erdős number is 3 due to my collaboration with Chuan Guo.

