LLM-based applications face critical security challenges in the form of prompt injections and jailbreaks. This article examines the key architectural improvements behind ModernBERT and demonstrates how to fine-tune it to distinguish malicious prompts from benign ones. Our model closely approximates the performance of Claude 3.7 and Gemini Flash 2.0 on a mixed benchmark (NotInject, BIPIA, Wildguard-Benign, and PINT) while keeping latency low (under 40 ms).