Sizhe Chen (陈思哲)

Biography

Hi! I am a CS Ph.D. candidate at UC Berkeley in Berkeley AI Research (BAIR), where I am fortunate to be advised by Prof. David Wagner, with additional thesis committee members Raluca Ada Popa, Sewon Min, and Eric Wallace. Supported by the Google-BAIR and Meta-BAIR Commons, I am working with Chawin Sitawarin and Arman Zharmagambetov, and have worked with Chuan Guo and Nicholas Carlini. I received my M.Eng. (National Scholarship) and B.Eng. (Summa Cum Laude) from Shanghai Jiao Tong University, where I worked with Prof. Xiaolin Huang.

I study AI security in real-world applications. Currently, I am defending against prompt injection attacks, the top-1 threat to AI agents. Prompt injection has caused actual harm to multiple AI systems from Google, OpenAI, Anthropic, Slack, etc. To open up broader usage of LLMs in agents, I develop principled, general, and practical prompt injection defenses. Our state-of-the-art open robust LLMs, Meta-SecAlign (ready for commercial usage), achieve an order-of-magnitude lower attack success rate against various prompt injections, and have been downloaded 10K times in 3 months.

I am fortunate to have mentored or worked with many talented students: Yizhu Wang, Jing Qian, Shutong Wu, Zhixing Ye, etc. With David Wagner, I am looking for a Summer 2026 intern who is applying to Fall 2027 Ph.D. programs.

Invited Talks

  • Securing LLMs Against Prompt Injection for Agentic Applications
    Penn State University: Guest Lecture at Threats and Cybersecurity 2025
    Cornell University (Cornell-Tech Campus): Guest Lecture at Trustworthy AI 2025
    Google DeepMind: Adversarial Machine Learning Seminar 2025
    Duke University: Guest Lecture at Generative AI: Foundations, Applications, and Safety 2025
    UC Berkeley: Security Seminar 2024
    Hong Kong Baptist University: TMLR Young Scientist Seminar 2024
    Shanghai Jiao Tong University: PAMI Group Seminar 2024
  • On the Learning Preference of Deep Neural Networks
    ICLR Oral Track 2023
    AI Time Youth Ph.D. Talk 2023
  • Subspace Adversarial Training
    CVPR Oral Track 2022
  • Adversarial Attacks and Defenses
    Northeastern University: Security Seminar 2022

Selected Publications

  • Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks
    Sizhe Chen*, Arman Zharmagambetov, David Wagner, Chuan Guo*

    Meta-SecAlign-70B is the first open-source LLM with built-in prompt injection defense and commercial-grade performance. It greatly outperforms gpt-4o and gemini-2.5-flash, and is comparable to gpt-5, in agentic (tool/web) utility and security.
  • SecAlign: Defending Against Prompt Injection with Preference Optimization
    Sizhe Chen, Arman Zharmagambetov, Saeed Mahloujifar, Kamalika Chaudhuri, David Wagner, Chuan Guo
    SecAlign aims for a prompt-injection-robust LLM that prefers (and thus outputs) the secure response over the insecure one.
  • StruQ: Defending Against Prompt Injection with Structured Queries
    Sizhe Chen, Julien Piet, Chawin Sitawarin, David Wagner

    StruQ is a general framework that defends against prompt injection by separating the prompt (user instruction) and data into two channels.
  • Defending Against Prompt Injection with DataFilter
    Yizhu Wang, Sizhe Chen, Raghad Alkhudair, Basel Alomair, David Wagner

  • Defending Against Prompt Injection with a Few DefensiveTokens
    Sizhe Chen, Yizhu Wang, Nicholas Carlini, Chawin Sitawarin, David Wagner
  • One-Pixel Shortcut: On the Learning Preference of Deep Neural Networks
    Shutong Wu*, Sizhe Chen*, Cihang Xie, Xiaolin Huang
  • Universal Adversarial Attack on Attention and the Resulting Dataset DAmageNet
    Sizhe Chen, Zhengbao He, Chengjin Sun, Jie Yang, Xiaolin Huang
  • Subspace Adversarial Training
    Tao Li, Yingwen Wu, Sizhe Chen, Kun Fang, Xiaolin Huang

Services

  • Reviewer: CCS 2024/2025/2026, SaTML 2025/2026, NeurIPS 2023/2025, ICML 2024/2025, ICLR 2023/2024/2025/2026, CVPR 2023/2024/2025, ICCV 2023, ECCV 2022/2024, IEEE TPAMI, Machine Learning, Pattern Recognition
  • UC Berkeley EECS Student Reviewer: Faculty Hiring Committee 2024, Ph.D. Admission Committee 2024, Equal Access to Application Assistance 2024

Awards

  • Research Funding: Meta-BAIR Commons 2024-2026, Google-BAIR Commons 2024-2026, UC Berkeley EECS Departmental Fellowship 2023, NeurIPS 2022 and ICLR 2023 Travel Support
  • Degree Awards: SJTU Best Bachelor’s Thesis (1%) 2020, SJTU Outstanding Graduate 2022/2023
  • Scholarship: China National Scholarship (0.2%) 2021/2022, Kwang-Hua Scholarship 2019, Arawana Scholarship 2017

Misc

  • I practice neatness and minimalism.
  • I love to sing, attend concerts, take photos, hike, ski, and play badminton and table tennis.
  • I write blog posts (in Chinese for now) about my thoughts and experiences.
  • My Erdős number is 3 due to my collaboration with Chuan Guo.