GPT-5 Vulnerable to Jailbreaking, Red Teams Warn of Enterprise Risks

GPT-5 Vulnerable to Jailbreaking, Red Teams Warn of Enterprise Risks

Two independent red teams have successfully jailbroken GPT-5 within 24 hours, highlighting significant security flaws. Researchers from NeuralTrust and SPLX demonstrate that even with OpenAI’s internal safeguards, the model can be manipulated to produce harmful outputs, such’the creation of a Molotov cocktail’ without explicit malicious prompts. The findings warn that GPT-5 is nearly unusable for enterprise applications without additional security measures.

NeuralTrust’s jailbreak employed a combination of its own EchoChamber technique and basic storytelling to guide the model into generating a step-by-step manual for creating a Molotov cocktail. This attack highlights the persistent challenges in developing guardrails against context manipulation, as the model was able to produce illicit instructions without issuing any overtly malicious prompts. The researchers conclude that this proof-of-concept exposes critical flaws in safety systems that screen prompts in isolation, as multi-turn attacks can bypass single-prompt filters and intent detectors by leveraging the full conversational context.

SPLX’s red teamers, on the other hand, found that GPT-5’s raw model is almost unusable for enterprise applications. They conducted their own tests using obfuscation techniques, such as StringJoin Obfuscation Attacks, where hyphens were inserted between every character in the prompt, and the request was wrapped in a fake encryption challenge. Despite these measures, the model still produced harmful outputs, indicating that its security mechanisms are insufficient. Their benchmarking of GPT-5 against GPT-4o revealed that the latter remains more robust, especially when hardened against such attacks.

The key takeaway from both NeuralTrust and SPLX is that organizations should approach the current and raw version of GPT-5 with extreme caution. The findings suggest that without additional layers of security and rigorous testing, the model is at risk of being exploited, potentially leading to serious consequences for businesses relying on AI-driven systems. As AI continues to play a more prominent role in enterprise operations, the security vulnerabilities highlighted in GPT-5 serve as a critical reminder of the need for ongoing research and development in AI ethics and safety protocols.