Release notes: Aug 27, 2025 (v.8.280.1)

This is a SaaS-only release.

Inference Red-Team

August attack pack: More prompts, more power

At CalypsoAI, we’re continuing to push the boundaries of AI security. Our August Signature attack pack was created and deployed end-to-end by AI agents, expanding coverage and delivering even stronger results.

This month’s release adds 11,500+ new adversarial prompts across a wide range of attack families, such as MathPrompt, DAN, and payload splitting. It also introduces FlipAttack, a new vector that targets the predictive, step-by-step reasoning process of LLMs. By flipping word order, characters, or even entire sentences and then instructing the model to “denoise” the text, attackers can disguise harmful requests and bypass safeguards.

Together, these attacks are uncovering deeper vulnerabilities and incorporating the latest threat intelligence from the field.
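
The flipping transformations behind FlipAttack can be illustrated with a minimal sketch on harmless text. The function names below are illustrative assumptions and are not taken from the attack pack itself; the point is only to show what "flipped" input looks like to a scanner or reviewer.

```python
def flip_word_order(text: str) -> str:
    """Reverse the order of the words, keeping each word intact."""
    return " ".join(reversed(text.split()))


def flip_chars_in_words(text: str) -> str:
    """Reverse the characters inside each word, keeping the word order."""
    return " ".join(word[::-1] for word in text.split())


def flip_sentence(text: str) -> str:
    """Reverse the whole string character by character."""
    return text[::-1]


sample = "open the pod bay doors"
print(flip_word_order(sample))      # doors bay pod the open
print(flip_chars_in_words(sample))  # nepo eht dop yab srood
print(flip_sentence(sample))        # srood yab dop eht nepo
```

For single-spaced text each transformation is its own inverse, so applying the same function again restores the original. That is why a model asked to "denoise" the text can recover the hidden request.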

NOTE: With our next SaaS release, attack packs will be relabeled with the name of the following month to align with the CASI leaderboard. The change affects all previous attack packs: for example, “July Attack Pack” will become “August Attack Pack”, “August Attack Pack” will become “September Attack Pack”, and so on.
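
If you reference attack packs by name in your own scripts or reports, the relabeling can be sketched as a one-month shift. This is a hypothetical helper for illustration, not part of the product; it assumes pack names start with an English month name, and the December-to-January wrap is an assumption the note does not cover.

```python
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def relabel(pack_name: str) -> str:
    """Shift the leading month name of a pack label forward by one month."""
    month, _, rest = pack_name.partition(" ")
    index = MONTHS.index(month)  # raises ValueError if the label has no leading month
    return f"{MONTHS[(index + 1) % 12]} {rest}"


print(relabel("July Attack Pack"))
print(relabel("August Attack Pack"))
```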

Stronger operational coverage with TLS redo

We’ve reworked our TLS operational attack to support additional security header checks and to reduce noise in results by lowering the minimum recommended version to TLS 1.2. This update improves accuracy while broadening coverage of TLS-related vulnerabilities.
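
As a rough sketch of what this kind of check involves, the function below flags a negotiated protocol older than TLS 1.2 and reports missing security headers. The header list and function name are assumptions made for illustration; the release notes do not say which headers the operational attack inspects.

```python
from typing import List, Mapping

# Minimum recommended protocol version (TLS 1.2), as stated above.
MIN_RECOMMENDED = (1, 2)

# Assumption: an illustrative header list, not the product's actual checks.
EXPECTED_HEADERS = (
    "strict-transport-security",
    "x-content-type-options",
)

# Protocol names in the form returned by Python's ssl.SSLSocket.version().
VERSIONS = {
    "TLSv1": (1, 0),
    "TLSv1.1": (1, 1),
    "TLSv1.2": (1, 2),
    "TLSv1.3": (1, 3),
}


def evaluate_endpoint(negotiated: str, headers: Mapping[str, str]) -> List[str]:
    """Return a list of findings; an empty list means nothing was flagged."""
    findings = []
    version = VERSIONS.get(negotiated)
    if version is None:
        findings.append(f"unrecognized protocol: {negotiated}")
    elif version < MIN_RECOMMENDED:
        findings.append(f"{negotiated} is below the recommended minimum TLSv1.2")

    # Header names are case-insensitive, so compare in lowercase.
    present = {name.lower() for name in headers}
    for header in EXPECTED_HEADERS:
        if header not in present:
            findings.append(f"missing security header: {header}")
    return findings


good_headers = {
    "Strict-Transport-Security": "max-age=63072000",
    "X-Content-Type-Options": "nosniff",
}
print(evaluate_endpoint("TLSv1.2", good_headers))
print(evaluate_endpoint("TLSv1.1", {}))
```

With the threshold at TLS 1.2, an endpoint that negotiates TLS 1.2 produces no version finding, which is the noise reduction described above.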

Major reduction in false positives for attack evaluations

We’ve introduced a new refusal checker that delivers more accurate and consistent evaluations of model responses. It is built around two key capabilities:

Response-only evaluation – The checker now evaluates refusals based solely on the model’s response, not the attack prompt. This ensures consistency and avoids confusion caused by the adversarial prompts themselves.
Error-aware detection – The checker now recognizes error messages returned as model outputs (e.g., guardrail rejections, networking errors, or system refusals) and correctly treats them as refusals.

This new feature ensures more accurate evaluation of LLM behavior across attack scenarios, particularly when guardrails are in use.
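
The two capabilities can be pictured with a simplified, pattern-based sketch. This is not the actual checker, whose implementation is not described here; the patterns, names, and `Verdict` type are assumptions. Note that `check_refusal` takes only the response, never the attack prompt.

```python
import re
from dataclasses import dataclass

# Assumption: illustrative patterns only; a production checker would be far more robust.
REFUSAL_PATTERNS = (
    r"\bi (?:cannot|can['’]?t|won['’]?t|am unable to)\b",
    r"\bi['’]?m sorry\b",
)
ERROR_PATTERNS = (
    r"blocked by (?:a )?guardrail",
    r"connection (?:reset|refused|timed out)",
    r"request timed out",
)


@dataclass
class Verdict:
    refused: bool
    reason: str  # "error", "refusal", or "answered"


def check_refusal(response: str) -> Verdict:
    """Classify a model response without looking at the prompt that produced it."""
    # Error-aware detection: error messages returned as output count as refusals.
    for pattern in ERROR_PATTERNS:
        if re.search(pattern, response, re.IGNORECASE):
            return Verdict(refused=True, reason="error")
    # Response-only evaluation: look for refusal language in the response itself.
    for pattern in REFUSAL_PATTERNS:
        if re.search(pattern, response, re.IGNORECASE):
            return Verdict(refused=True, reason="refusal")
    return Verdict(refused=False, reason="answered")


print(check_refusal("I'm sorry, but I can't help with that."))
print(check_refusal("Request blocked by guardrail: policy violation"))
print(check_refusal("Sure, here is a summary of the article."))
```

Because the prompt is never an input, adversarial wording in the attack cannot influence the verdict. A guardrail rejection is scored as a refusal rather than as a successful attack.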

Inference Defend

Access control clarity

We’ve improved the messaging associated with the scanner access control so it’s easier to understand which projects have access to which scanners. Users now have three options for configuring scanner access control:

All projects toggle: this makes the scanner available to all current and future projects.
Select all checkbox: this makes the scanner available to all current projects, but not future ones.
Multi-select project checkboxes: this makes the scanner available to only the selected projects.
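
The difference between the three options, in particular between the All projects toggle and the Select all checkbox, can be modeled with a small sketch. The class and function names here are hypothetical and do not reflect the product's API or data model.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set


@dataclass
class ScannerAccess:
    all_projects: bool = False  # models the "All projects" toggle
    project_ids: Set[str] = field(default_factory=set)  # explicitly selected projects

    def allows(self, project_id: str) -> bool:
        return self.all_projects or project_id in self.project_ids


def select_all(current_projects: Iterable[str]) -> ScannerAccess:
    """Model the "Select all" checkbox: a snapshot of the projects that exist now."""
    return ScannerAccess(project_ids=set(current_projects))


current = ["alpha", "beta"]
toggle = ScannerAccess(all_projects=True)
snapshot = select_all(current)
chosen = ScannerAccess(project_ids={"alpha"})

# A project created after the access rule was configured.
print(toggle.allows("gamma"))    # True: the toggle covers future projects
print(snapshot.allows("gamma"))  # False: the snapshot covers current projects only
print(chosen.allows("beta"))     # False: only the selected project has access
```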


To configure scanner access, navigate to Scanners, click the three-dots menu next to the scanner or scanner package, and select Access control.

Bug fixes

The "All Attacks" campaign was missing from the Run attack feature. Resolution: Fixed.
A translation key was missing for the Canceling status in the filter. Resolution: Fixed.
The size select was off on the reports view. Resolution: Fixed.
Single-turn agentic attacks were appearing in the dropdown. Resolution: Fixed.
The content area and side panel in the fingerprints empty state had different heights. Resolution: Fixed.
The animation logo in Outcome Analysis did not behave correctly with long text. Resolution: Fixed.
Pressing Next row at the end of prompt logs did not wait for the next page. Resolution: Fixed.
The Discard changes modal appeared incorrectly after saving a new scanner version. Resolution: Fixed.
The Load more button in custom scanners testing panel loaded duplicates. Resolution: Fixed.
There was an extra apostrophe in the Save new version modal. Resolution: Fixed.
Drop shadows on side panels, filters, and dropdowns were inconsistent. Resolution: Fixed.
The scanner name in blocked chat bubbles was not linking to edit/version history. Resolution: Fixed.
Agentic fingerprints was incorrectly named “Agentic attacks”. Resolution: Fixed.