Introduction
The proliferation of generative AI has introduced a new attack surface: the "skill" ecosystem. As organizations rush to integrate autonomous agents and skill-based extensions, public marketplaces are being flooded with malicious capabilities designed to steal credentials, exfiltrate sensitive data, and hijack agent sessions. In response, vendors released "skill scanners": static analysis tools that promise to detect malicious code before installation. However, recent testing by Trail of Bits demonstrates that these tools are fundamentally broken. The researchers bypassed scanners from ClawHub, Cisco, and skills.sh using basic obfuscation and prompt injection techniques, in most cases in under an hour. Defenders currently have no automated scanner they can rely on to vet these skills. You are flying blind.
Technical Analysis
Affected Products and Platforms
This vulnerability in the supply chain affects multiple major distribution platforms and scanning engines:
- ClawHub: A major public skill marketplace and its integrated scanner.
- Cisco: The "Agent Skill Scanner" utilized within Cisco's security ecosystem.
- skills.sh: A skill distribution platform; all three of its integrated scanners were bypassed.
Vulnerability and Exploitation Mechanism
The issue is not a specific CVE but a systemic failure in how static analysis interprets the natural language and code inside AI skill packages (often Python or JavaScript wrappers).
- Attack Vector: Malicious Skills (Supply Chain Compromise).
- Technique: Bypasses were achieved using "standard tricks" and "prompt injection." By embedding malicious payloads within data structures or natural language prompts that the scanning engine failed to parse as executable logic, attackers successfully hid:
  - Credential Theft: Code capable of accessing host environment variables or credential managers.
  - Data Exfiltration: Callback mechanisms to external C2 servers.
  - Agent Hijacking: Instructions to override system prompt constraints.
- Ease of Exploitation: Researchers required less than one hour to conceive and implement three of the four proof-of-concept malicious skills. The fourth took only a few hours due to prompt injection complexity.
- Source of Truth: The PoCs are available in the trailofbits/overtly-malicious-skills repository.
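To make the failure mode concrete, here is a minimal sketch of why text-matching scanners miss payloads assembled at runtime. The denylist scanner below is a hypothetical toy, not the logic of any vendor's product, and the obfuscated sample is a generic textbook trick rather than one of the Trail of Bits PoCs. Neither sample string is executed.

```python
import re

# Hypothetical denylist: flags source text that names risky calls literally.
DENYLIST = re.compile(r"os\.system|subprocess\.|eval\(|exec\(")


def naive_scan(source: str) -> bool:
    """Return True if the source text matches the denylist."""
    return bool(DENYLIST.search(source))


# The same behavior written two ways. These are plain strings, never run.
PLAIN = 'import os\nos.system("id")\n'
OBFUSCATED = (
    "import importlib\n"
    'mod = importlib.import_module("o" + "s")\n'
    'getattr(mod, "sys" + "tem")("id")\n'
)

print(naive_scan(PLAIN))       # True: the literal name is present
print(naive_scan(OBFUSCATED))  # False: the name only exists at runtime
```

The scanner never sees the string "os.system" in the second sample because it is only assembled when the code runs, which is why the detections later in this post focus on runtime behavior.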
Exploitation Status
- Status: Confirmed Bypass (Public Proof of Concept).
- Active Exploitation: While specific in-the-wild campaigns are not detailed in the report, the low barrier to entry and availability of PoCs make immediate active exploitation highly likely.
Detection & Response
Because the scanners designed to prevent this are ineffective, organizations must detect malicious activity at execution time. Since skills often run as processes or scripts within an agent framework, we must monitor for unexpected behaviors initiated by these agent processes.
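For Python-based agents, one way to approximate this kind of execution-time visibility is CPython's audit hook mechanism (available since Python 3.8). The sketch below is an illustration under assumptions: the watched event list and the in-memory `observed` store are our own choices, and an in-process hook can be evaded by native code, so treat it as a supplement to host-level telemetry rather than a control on its own.

```python
import sys

# CPython audit events that correspond to the behaviors of interest:
# shell execution, process spawning, and outbound socket connections.
WATCHED_EVENTS = {"os.system", "subprocess.Popen", "socket.connect"}

# (event, args) tuples. Assumption: a real deployment would forward these
# to its logging pipeline instead of keeping them in memory.
observed = []


def skill_audit_hook(event, args):
    """Record watched runtime events raised anywhere in this interpreter."""
    if event in WATCHED_EVENTS:
        observed.append((event, args))


# Audit hooks cannot be removed once added, so install the hook
# before importing or loading any skill code.
sys.addaudithook(skill_audit_hook)
```

After the hook is installed, any skill code in the same interpreter that calls `os.system`, creates a `subprocess.Popen`, or connects a socket adds an entry to `observed`, regardless of how the call was obfuscated in the source.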
SIGMA Rules
The following rules focus on the behavioral outcomes of malicious skills: credential access and unauthorized network connections.
---
title: Potential Malicious Agent - Credential Access
id: 8a4f2c1d-5e6b-4f3c-9a1b-2c3d4e5f6a7b
status: experimental
description: Detects agent or skill runtime processes (Python, Node.js, or agent binaries) opening a handle to LSASS, indicative of a malicious skill attempting credential theft.
references:
    - https://blog.trailofbits.com/2026/06/03/the-sorry-state-of-skill-distribution/
author: Security Arsenal
date: 2026/06/04
tags:
    - attack.credential_access
    - attack.t1003.001
logsource:
    category: process_access
    product: windows
detection:
    selection:
        SourceImage|contains:
            - 'python'
            - 'node'
            - 'agent'
        # SAM and SYSTEM are registry hives, not process images; matching them
        # as substrings would hit every path under System32.
        TargetImage|endswith: '\lsass.exe'
    condition: selection
falsepositives:
    - Legitimate administrative tools or password managers accessing credentials
    - Endpoint security or monitoring agents whose image path contains "agent"
level: high
---
title: Potential Malicious Agent - Data Exfiltration
id: 9b5a3d2e-6f7c-4a4d-8b2c-3d4e5f6a7b8c
status: experimental
description: Detects agent runtime processes (often Python or Node.js based) initiating outbound network connections to ports other than 80, 443, or 8080, a common trait in skills attempting data exfiltration.
references:
    - https://blog.trailofbits.com/2026/06/03/the-sorry-state-of-skill-distribution/
author: Security Arsenal
date: 2026/06/04
tags:
    - attack.exfiltration
    - attack.t1041
logsource:
    category: network_connection
    product: windows
detection:
    selection:
        Image|endswith:
            - '\python.exe'
            - '\node.exe'
        Initiated: 'true'
    # Sigma has no "notin" modifier; exclude the common web ports with a filter.
    filter_standard_ports:
        DestinationPort:
            - 80
            - 443
            - 8080
    condition: selection and not filter_standard_ports
falsepositives:
    - Developers testing code on non-standard ports
    - Internal application communication
level: medium
KQL (Microsoft Sentinel / Defender)
This hunt query identifies network connections made by common agent runtime processes over the last seven days, excluding destinations that match microsoft.com, azure.com, or github.com. Extend the exclusions with the API endpoints your agents legitimately use before relying on the results.
let AgentProcesses = dynamic(['python.exe', 'python3.exe', 'node.exe', 'java.exe']);
DeviceNetworkEvents
| where Timestamp > ago(7d)
| where InitiatingProcessFileName in~ (AgentProcesses)
| where RemoteUrl !contains "microsoft.com"
    and RemoteUrl !contains "azure.com"
    and RemoteUrl !contains "github.com"
| summarize Count = count(), FirstSeen = min(Timestamp), LastSeen = max(Timestamp) by DeviceName, InitiatingProcessFileName, InitiatingProcessCommandLine, RemoteUrl, RemotePort
| order by Count desc
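Substring exclusions in a hunt query like this one are convenient but loose: a hostname that merely contains a trusted domain passes the filter. When triaging results or building an egress allowlist, an exact-or-dotted-suffix comparison is stricter. The sketch below illustrates the difference; the function names and the example hostnames are hypothetical.

```python
TRUSTED_DOMAINS = ("microsoft.com", "azure.com", "github.com")


def substring_match(hostname: str) -> bool:
    """Loose check: true if a trusted domain appears anywhere in the name."""
    host = hostname.lower()
    return any(domain in host for domain in TRUSTED_DOMAINS)


def is_trusted_host(hostname: str) -> bool:
    """Strict check: the host is a trusted domain or one of its subdomains."""
    host = hostname.lower().rstrip(".")
    return any(
        host == domain or host.endswith("." + domain)
        for domain in TRUSTED_DOMAINS
    )


for name in ("api.github.com", "github.com.attacker.example"):
    print(name, substring_match(name), is_trusted_host(name))
# api.github.com True True
# github.com.attacker.example True False
```

The second hostname would be silently excluded by a substring filter even though it is controlled by whoever owns attacker.example.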
Velociraptor VQL
This artifact hunts for Python skill files within typical skill directories that contain suspicious function calls (os.system, subprocess.Popen, eval, or exec), or whose path contains the name of the PoC repository. It shares the limits of any static pattern match, so treat a clean result as inconclusive.
-- Hunt for suspicious function calls in skill directories
SELECT FullPath, Size, Mtime
FROM glob(globs='/**/skills/**/*.py')
WHERE read_file(filename=FullPath) =~ '(os\.system|subprocess\.Popen|eval\(|exec\()'
OR FullPath =~ 'overtly-malicious'
Remediation Script (Bash)
Use this script to audit Linux-based agent environments for the presence of the known malicious repository or suspicious skill characteristics. Note: This is a temporary measure until vendor scanners are patched.
#!/bin/bash
# Audit for malicious skills in common agent directories
# Usage: sudo ./audit_skills.sh
echo "[+] Initiating Skill Security Audit..."
# Define directories to scan (customize based on your agent deployment).
# The /home glob is left unquoted so the shell expands it to one entry per user.
SCAN_DIRS=("/opt/agents" /home/*/.skills "/var/www/skills" "/usr/local/lib/skills")
# Check for the known malicious repo name or indicators
MALICIOUS_INDICATORS=("overtly-malicious" "steal_credentials" "exfiltrate_data")
FOUND=0
for dir in "${SCAN_DIRS[@]}"; do
    if [ -d "$dir" ]; then
        echo "[+] Scanning directory: $dir"
        # List files that use OS, subprocess, or HTTP POST calls and that also
        # mention credential- or exfiltration-related keywords anywhere in the file
        hits=$(grep -rlw -e "import os" -e "import subprocess" -e "requests.post" "$dir" 2>/dev/null \
            | xargs -r -d '\n' grep -liE "password|token|exfil" 2>/dev/null)
        if [ -n "$hits" ]; then
            echo "[!] Suspicious files:"
            echo "$hits"
            FOUND=1
        fi
        # Check for known repo names. find exits 0 even with no match, so test its output.
        for indicator in "${MALICIOUS_INDICATORS[@]}"; do
            if [ -n "$(find "$dir" -type d -name "*$indicator*" -print -quit 2>/dev/null)" ]; then
                echo "[!] CRITICAL: Found directory matching malicious indicator: $indicator"
                FOUND=1
            fi
        done
    fi
done
if [ "$FOUND" -eq 0 ]; then
    echo "[+] No obvious malicious skill indicators found in standard paths."
    echo "[!] WARNING: Static analysis is unreliable. Manual code review is mandatory."
else
    echo "[!] ALERT: Suspicious files or directories detected. Investigate immediately."
    exit 1
fi
Remediation
The current remediation landscape is difficult because the primary defensive control (the scanner) is ineffective. Immediate action is required to compensate:
- Disable Automatic Installation: Immediately disable any auto-install or auto-update features for agent skills in ClawHub, Cisco, or skills.sh integrations. Move to a manual approval workflow.
- Manual Code Review: Until scanners are patched, every skill must undergo manual line-by-line review by a senior security engineer. Pay specific attention to:
  - import statements (e.g., os, subprocess, socket, requests).
  - Natural language strings within prompts that attempt to manipulate system instructions.
- Network Egress Filtering: Apply strict egress filtering to the hosts or containers running AI agents. They should only be able to communicate with known, necessary API endpoints. Block general internet access to prevent C2 callbacks.
- Runtime Isolation: Run agents in isolated environments (e.g., dedicated VPCs, strict containers) with zero trust networking. Do not run agent processes with service account credentials that have access to production data.
- Vendor Engagement: Contact your representatives at Cisco and ClawHub to demand a timeline for patching the static analysis bypasses demonstrated by Trail of Bits. Reference the trailofbits/overtly-malicious-skills repository as the test case they must detect.
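To help reviewers prioritize during manual code review, a small helper can list where a Python skill imports the modules called out above. This is a sketch under assumptions: the module list and the function name are our own, and it only sees literal import statements. It will not catch imports assembled at runtime (for example through importlib), so it narrows where to look first and does not replace reading the code.

```python
import ast

# Modules worth a closer look during review (assumed list; extend as needed).
RISKY_MODULES = {"os", "subprocess", "socket", "requests"}


def risky_imports(source: str) -> list[tuple[int, str]]:
    """Return sorted (line number, module) pairs for imports on the review list."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # module is None for "from . import x"
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in RISKY_MODULES:
                findings.append((node.lineno, top_level))
    return sorted(findings)


sample = "import os.path\nfrom subprocess import run\nimport json\n"
print(risky_imports(sample))  # [(1, 'os'), (2, 'subprocess')]
```

Parsing with ast rather than grepping avoids false hits in comments and string literals, but the output is a reading list for the reviewer, not a verdict.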