Edoardo Debenedetti

Computer Science PhD Student

ETH Zürich

SPY Lab

I am a Computer Science Ph.D. Student at ETH Zürich in the Secure and Private AI (SPY) Lab, advised by Florian Tramèr. I am also a Student Researcher in the ML Red Team at Google, working with Tianqi Fan and Ilia Shumailov. I am interested in the security and privacy of machine learning systems. Most recently, I have been looking into the security of LLM Agents.

My research is supported by a CYD Doctoral Fellowship awarded by the armasuisse Cyber-Defence Campus.

Prior to my PhD, I earned a Computer Science M.Sc. at EPFL and a Computer Engineering B.Sc. at the Polytechnic University of Turin.

Outside of my studies, I was a software engineering intern at Bloomberg LP and a Research Intern at the armasuisse CYD Campus, supervised by Prof. Humbert.

More information can be found on my CV, last updated on 2024/10/20. For a more recent one, feel free to reach out via email.

In my free time, I enjoy all things outdoors, from hiking and skiing to sailing and climbing.

Interests
  • Security of Large Language Model Agents
  • Real-world Machine Learning evasion attacks
  • Privacy in Machine Learning
Education
  • Ph.D. in Computer Science, 2022 - Ongoing

    ETH Zürich - Swiss Federal Institute of Technology, Zürich, Switzerland 🇨🇭

  • M.Sc. in Computer Science, 2019 - 2022

    EPFL - Swiss Federal Institute of Technology, Lausanne, Switzerland 🇨🇭

  • B.Sc. in Computer Engineering, 2016 - 2019

    PoliTo - Politecnico di Torino, Italy 🇮🇹

News

[10/2024 - Internship] On October 1st, I started as a Student Researcher in the ML Red Team at Google. I will be working with Tianqi Fan and Ilia Shumailov (Google DeepMind) on the security of AI agents.

[09/2024 - Spotlight] The report on our SaTML LLMs CTF has been accepted as a spotlight at the NeurIPS 2024 D&B Track! AgentDojo and JailbreakBench were also accepted.

[04/2024 - Award] Evading Black-box Classifiers Without Breaking Eggs was selected as Distinguished Paper Award Runner-up at IEEE SaTML 2024!

Publications

(2024). AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. NeurIPS D&B 2024.

(2024). Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition. NeurIPS D&B 2024 (Spotlight).

(2024). JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models. NeurIPS D&B 2024.