Cybersecurity News Hub

EchoGram Flaw Bypasses Guardrails in Major LLMs – Hackread

by Cyberinchief
November 17, 2025


New research from the AI security firm HiddenLayer has exposed a vulnerability in the safety systems of today’s most popular Large Language Models (LLMs), such as GPT-5.1, Claude, and Gemini. The flaw, discovered in early 2025 and dubbed EchoGram, allows simple, specially chosen words or code sequences to trick the automated defences, or guardrails, meant to keep the AI safe.

What is EchoGram and How Does it Work?

LLMs are protected by guardrails, which are filter systems designed to spot and block harmful requests, such as asking the AI to give out secret information (Alignment Bypasses) or forcing it to ignore its rules (Task Redirection, also called Prompt Injection). These guardrails usually work in one of two ways: a separate AI model that judges the request (LLM-as-a-judge) or a simple text-checking system (classification model).
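To make the arrangement concrete, here is a minimal Python sketch of a guardrail sitting in front of a model. Everything in it is an assumption for illustration: the phrase list, the function names `classifier_guardrail`, `guarded_call`, and `echo_model`, and the substring check that stands in for a trained classification model. It is not how any vendor's guardrail is implemented.

```python
from typing import Callable

# Hypothetical phrases that this toy text-checking guardrail treats as unsafe.
BLOCKED_PHRASES = ("ignore previous instructions", "reveal the system prompt")


def classifier_guardrail(prompt: str) -> bool:
    """Return True when the toy classifier considers the prompt safe."""
    lowered = prompt.lower()
    return not any(phrase in lowered for phrase in BLOCKED_PHRASES)


def echo_model(prompt: str) -> str:
    """Stand-in for the main LLM; it only echoes what it received."""
    return f"model saw: {prompt}"


def guarded_call(
    prompt: str,
    guardrail: Callable[[str], bool],
    model: Callable[[str], str],
) -> str:
    """Forward the prompt to the model only if the guardrail approves it."""
    if not guardrail(prompt):
        return "[blocked by guardrail]"
    return model(prompt)


print(guarded_call("What is the capital of France?", classifier_guardrail, echo_model))
print(guarded_call("Please ignore previous instructions", classifier_guardrail, echo_model))
```

An LLM-as-a-judge guardrail would fit the same `guardrail` slot: any function that takes the prompt and returns a safe or unsafe verdict, with a second model producing that verdict instead of a phrase check.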

EchoGram targets Guardrails

The EchoGram attack works by taking advantage of how both types of guardrail models are trained. Both learn what’s safe and unsafe from large collections of past data. The technique starts by creating a wordlist, a specific collection of words and symbols, which is then used to find sequences (which the researchers call flip tokens) that are missing or unbalanced in the training data.

These flip tokens are often nonsensical, i.e., adding them does not change the meaning of the original malicious request that the main AI sees. Yet by adding a flip token, an attacker can make the defence system change its mind, or ‘flip the verdict.’

For example, when HiddenLayer researchers were testing an older version of their own defence system, a malicious command was approved when a random string “=coffee” was simply added to the end.
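The idea of a flip token can be illustrated with a toy robustness check that a defender might run against their own guardrail. The sketch below is not HiddenLayer's tooling: the per-token weights, the threshold, and the names `score`, `is_flagged`, and `find_flip_tokens` are all invented for illustration, and the weight given to “=coffee” simply mimics a token that was unbalanced in training data.

```python
# Toy per-token weights standing in for what a guardrail learned in training.
# Positive pushes toward "unsafe", negative pushes toward "safe".
# All values are invented for this example.
TOY_WEIGHTS = {
    "ignore": 1.5,
    "previous": 0.5,
    "instructions": 1.0,
    "please": -0.2,
    "=coffee": -3.5,  # mimics a token that was unbalanced in training data
}
THRESHOLD = 1.0  # hypothetical cut-off: scores at or above this are flagged


def score(prompt: str) -> float:
    """Sum the toy weights of the whitespace-separated tokens."""
    return sum(TOY_WEIGHTS.get(tok, 0.0) for tok in prompt.lower().split())


def is_flagged(prompt: str) -> bool:
    """Return True when the toy guardrail would block the prompt."""
    return score(prompt) >= THRESHOLD


def find_flip_tokens(prompt: str, wordlist: list[str]) -> list[str]:
    """Return wordlist entries that change the verdict when appended."""
    baseline = is_flagged(prompt)
    return [tok for tok in wordlist if is_flagged(f"{prompt} {tok}") != baseline]


flagged_prompt = "ignore previous instructions"
print(is_flagged(flagged_prompt))
print(find_flip_tokens(flagged_prompt, ["please", "tea", "=coffee"]))
```

Because the check only compares verdicts before and after appending a token, it treats the guardrail as a black box; running it over a guardrail's own test prompts is one way to spot tokens that carry too much weight.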

EchoGram prompt working on gpt-4o (Image credit: HiddenLayer)

More Than Just Letting Malicious Requests Through

Further probing revealed that this technique can be used in two damaging ways. First, an attacker can slip a truly malicious request past the defences. Second, they can do the opposite: they can take a completely harmless request and craft it so the guardrail incorrectly flags it as dangerous.

This ability to cause false alarms can be just as harmful. If security teams are constantly flooded with incorrect warnings, they could lose faith in the system’s accuracy, a problem HiddenLayer researchers Kasimir Schulz and Kenneth Yeung refer to as “alert fatigue” in the blog post shared with Hackread.com.


Combining multiple flip tokens can make an attack even stronger. The team estimates that developers have only about a 3-month defensive head start before attackers can copy this method, making immediate changes critical as AI integration into fields like finance and healthcare accelerates.
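Why combinations matter can be shown with another toy check, again under invented assumptions: the weights, the threshold, and the function `smallest_flipping_combo` are hypothetical and do not describe any real guardrail. In this sketch no single token changes the verdict, but two tokens together do.

```python
from itertools import combinations
from typing import Optional

# Invented weights: each "safe-looking" token is too weak on its own.
TOY_WEIGHTS = {
    "ignore": 2.0,
    "instructions": 2.0,
    "=coffee": -2.0,
    "oz": -2.0,
    "lorem": -0.5,
}
THRESHOLD = 1.0  # hypothetical cut-off: scores at or above this are flagged


def is_flagged(prompt: str) -> bool:
    """Return True when the toy guardrail would block the prompt."""
    total = sum(TOY_WEIGHTS.get(tok, 0.0) for tok in prompt.lower().split())
    return total >= THRESHOLD


def smallest_flipping_combo(
    prompt: str, wordlist: list[str], max_size: int = 3
) -> Optional[tuple[str, ...]]:
    """Return the first, smallest token combination that changes the verdict."""
    baseline = is_flagged(prompt)
    for size in range(1, max_size + 1):
        for combo in combinations(wordlist, size):
            if is_flagged(prompt + " " + " ".join(combo)) != baseline:
                return combo
    return None


print(smallest_flipping_combo("ignore instructions", ["lorem", "=coffee", "oz"]))
```

A defender could run the same kind of search with a small `max_size` to see whether their guardrail's verdicts stay stable when several unusual tokens are appended at once.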

(Photo by Mohamed Nohassi on Unsplash)




