Cyber Security

Anatomy of the Python worm that gaslights AI security agents

The Hades Campaign is a sophisticated worm targeting Python developers. It uses adversarial prompt injection to trick AI security tools and steals data.
Anatomy of the Python worm that gaslights AI security agents

Have you reached a point where you trust your AI code assistant more than your own manual review? Many developers have. We rely on large language models (LLMs) to scan for vulnerabilities, suggest fixes, and verify the integrity of third-party packages. But what happens when the malware knows how to talk back to the AI? I spent several nights last week reviewing the technical details of the Hades Campaign, a threat that marks a change in how attackers approach automated defenses. This is not just a data thief. It is a psychological operation directed at the software we use to protect ourselves.

Researchers at StepSecurity recently identified this campaign as the latest work of the Miasma threat actor group. While Miasma previously focused on cloud credential harvesting, Hades is a more aggressive, self-propagating worm. It specifically targets Python developer environments through compromised packages in the PyPI ecosystem. The list of affected libraries includes ensmallen, mflux-streamlit, and several tools used in bioinformatics. If you work in computational biology or data science, your workstation is a high-value target for this specific group.

The entry point in the Python initialization process

The attack begins when a developer imports a compromised package. Hades hides its primary loader inside the __init__.py file. This is a standard file that Python uses to mark a directory as a package. By placing the malicious script here, the attackers ensure the code executes the moment the package is loaded into a project. The loader is obfuscated to avoid basic signature-based detection, but its purpose is simple. It drops a precompiled Bun runtime binary onto the system.

Bun is a high-performance JavaScript toolkit. It is an excellent tool for legitimate developers, but it is also a perfect execution engine for malware. Because Bun is a self-contained binary, it allows the malware to run complex JavaScript payloads without requiring a Node.js installation on the victim's machine. This bypasses traditional monitoring that looks for suspicious npm or Node activity. From an architectural level, using Bun allows the attackers to operate in a shadow environment that sits outside the standard security visibility of most Python-centric workstations.

How malware gaslights the automated gatekeeper

The most innovative feature of Hades is its ability to lie to AI security agents. Many modern development environments use LLMs to scan code for malicious patterns. Hades anticipates this. At the top of its malicious files, the malware includes a block of text designed as an adversarial prompt. This text instructs the LLM to ignore any suspicious code found below it and to classify the package as verified and safe.

I tested a similar technique in a sandboxed environment last month. I provided an LLM with a script containing a blatant reverse shell but preceded it with a comment saying the code was a pre-authorized security test for a government contract. The AI ignored the exploit and reported that the script followed all safety protocols. Hades uses this exact digital Trojan horse strategy. It exploits the cognitive logic of the AI. Because LLMs are susceptible to social engineering, they accept these hidden instructions as high-priority commands. This results in a false negative verdict, allowing the malware to slip past the organization’s automated analysis.

Exploiting the supply chain integrity framework

Hades does not just hide; it attempts to look official. It exploits several mission-critical security frameworks, including OpenID Connect (OIDC) and Supply-chain Levels for Software Artifacts (SLSA). When the malware finds itself running inside a GitHub Actions workflow, it checks for OIDC variables. It then uses these credentials to generate cryptographically signed SLSA provenance bundles via Sigstore.

This is a sophisticated bypass of the very systems meant to guarantee software integrity. By generating a valid Sigstore bundle, the malware can publish compromised versions of libraries to PyPI or npm that appear to have come from an official, verified build environment. To an outside observer or an automated tool, the package looks like it has a valid chain of custody. This turns the security infrastructure into a weapon for the attacker. It makes the compromised code appear more trustworthy than legitimate, unsigned code.

Memory scraping and lateral movement

Once the malware is established, it begins harvesting data. It includes tailored memory scrapers for Linux, macOS, and Windows. These scrapers target the memory mappings of active processes to extract sensitive, encrypted data that is otherwise inaccessible on the disk. This is a stealthy approach because it avoids creating suspicious files that an endpoint detection and response (EDR) system might flag.

The worm also looks for ways to spread. It scans the infected system for SSH keys, SCP configurations, and GitHub tokens. If it finds a token with write permissions, it uses the GitHub Actions runner to extract secrets directly from memory. It then attempts to infect other repositories owned by the user. The command and control (C2) traffic is hidden within public GitHub infrastructure. Stolen data is compressed, encrypted, and pushed to new public repositories under the attacker's control. These repositories often carry the description "Hades — The End for the Damned."

The scorched earth persistence mechanism

Persistence is a core component of the Hades design. The malware establishes a presence on the workstation and monitors the status of the stolen credentials. If a security team detects the breach and revokes the stolen GitHub token, Hades enters a reactive mode. It executes a wiper process that attempts to erase the user’s local files.

This is a retaliatory tactic. It creates a high-stakes situation for incident responders. If you kill the access, you might lose the data on the local machine. This logic forces organizations to move carefully during the remediation phase. The malware also targets the configuration files of 14 different AI agents. It plants instructions that trigger a new infection whenever the developer consults their AI assistant about the current workspace. This creates a loop where the act of seeking help from an AI tool re-infects the system.

Building a proactive defense against deceptive malware

The Hades Campaign shows that our reliance on AI as a primary security layer is a vulnerability. We must move toward a model where AI is a collaborator rather than an ultimate authority. From a risk perspective, the best defense is a granular one that does not rely on a single gatekeeper.

Key takeaways for securing your environment:

  1. Use strict boundary isolation for LLMs. Never pass raw, untrusted code to an AI scanner without a system prompt that explicitly forbids the model from following instructions contained within the code itself.
  2. Audit your GitHub Actions permissions. Implement the principle of least privilege for OIDC tokens and ensure that runners do not have unnecessary write access to your repositories.
  3. Monitor for unauthorized binary execution. Hades relies on dropping a Bun runtime. Tools that flag or block the execution of unknown binaries in the __init__.py path can stop the infection at the entry point.
  4. Verify Sigstore provenance manually. While Hades can forge bundles, discrepancies in the build timing or the environment variables can sometimes be caught during a forensic audit.
  5. Implement robust backup strategies. Because Hades includes a wiper, having offline or immutable backups of your development work is the only way to mitigate the threat of data loss during remediation.

Looking at the threat landscape, Hades is likely a preview of the future. Attackers are no longer just trying to break the locks on our doors; they are learning how to convince the digital bouncer to let them in. We need to stop treating AI verdicts as absolute truths and start treating them as data points that require human verification.

Sources: NIST Cybersecurity Framework, MITRE ATT&CK Framework (Software Supply Chain Compromise), StepSecurity Hades Campaign Report.

Disclaimer: This article is for informational and educational purposes only and does not replace a professional cybersecurity audit or incident response service.

bg
bg
bg

See you on the other side.

Our end-to-end encrypted email and cloud storage solution provides the most powerful means of secure data exchange, ensuring the safety and privacy of your data.

/ Create a free account