Hugging Face "Privacy Filter" Impersonation: Infostealer Detection and Remediation

A malicious repository masquerading as OpenAI's "Privacy Filter" project recently infiltrated the trending list on Hugging Face, a widely trusted platform for machine learning models and code. This repository was engineered to deliver information-stealing malware (infostealer) specifically targeting Windows users.

For defenders, this marks a shift in supply-chain attacks toward developer-centric hubs. Unlike traditional phishing, the attack exploits the trust placed in open-source communities and the pressure to adopt "trending" AI tools quickly. Response is time-critical because the payload is an infostealer, designed to siphon session tokens, credentials, and crypto-wallet data immediately upon execution.

Technical Analysis

Affected Platforms: Windows endpoints.
Attack Vector: Supply-chain compromise (Typosquatting/Impersonation).
Threat: Information Stealer (Infostealer) delivered via malicious Python/Batch scripts.

Attack Chain Breakdown:

Lure: Attackers create a repository on Hugging Face impersonating a legitimate project (e.g., "Privacy Filter"). They often include plausible README files and code structures to appear authentic.
Delivery: Users discover the repo via trending lists or search results and download its artifacts to their local machines, either as ZIP files or by cloning the repo.
Execution: The victim manually executes the script (e.g., python main.py or run.bat), often to "test" the model.
Payload Deployment: The script acts as a dropper, retrieving and executing a binary or PowerShell script on the local system.
Exfiltration: The infostealer harvests sensitive data (browser cookies, passwords, cryptocurrency wallets) and exfiltrates it to a Command and Control (C2) server.
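
The Execution and Payload Deployment steps suggest a simple pre-execution check: before running anything from a freshly downloaded repository, scan its scripts for the combination of a remote download and a process launch. The sketch below is a minimal illustration under stated assumptions. The function name `triage_repo` and the indicator patterns are hypothetical and are not signatures taken from the actual malicious repository; a clean result does not prove a repository is safe.

```python
import re
from pathlib import Path

# Hypothetical indicator patterns for illustration only; these are NOT
# signatures extracted from the real malicious repository.
INDICATORS = {
    "remote_download": re.compile(
        r"urllib\.request|requests\.get|Invoke-WebRequest|DownloadString|curl\s|certutil",
        re.I,
    ),
    "process_launch": re.compile(
        r"subprocess\.|os\.system|Start-Process|powershell(\.exe)?\s", re.I
    ),
    "obfuscation": re.compile(
        r"base64\.b64decode|FromBase64String|-enc(odedcommand)?\s|\bexec\(", re.I
    ),
}
SCRIPT_SUFFIXES = {".py", ".bat", ".cmd", ".ps1"}


def triage_repo(root):
    """Return {relative_path: sorted indicator names} for dropper-like scripts."""
    root = Path(root)
    findings = {}
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix.lower() not in SCRIPT_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        hits = sorted(name for name, rx in INDICATORS.items() if rx.search(text))
        # A script that both downloads and launches something resembles a dropper.
        if "remote_download" in hits and "process_launch" in hits:
            findings[str(path.relative_to(root))] = hits
    return findings
```

Calling `triage_repo` on the extracted folder returns a dictionary of flagged scripts, which an analyst can review before anyone runs `python main.py` or `run.bat`.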

Exploitation Status: Confirmed active exploitation. The repository reached the platform's trending list, significantly increasing the victim pool before removal.

Detection & Response

━━━ DETECTION CONTENT ━━━

Sigma Rules

The following rules detect the behavioral indicators of this attack: suspicious execution from download directories (common when testing new tools) and the hallmark behavior of infostealers accessing browser databases.

YAML

---
title: Suspicious Python Execution from User Downloads
id: 9a8b7c6d-5e4f-3a2b-1c0d-9e8f7a6b5c4d
status: experimental
description: Detects execution of Python scripts from the User Downloads directory, a common location for manually cloned or downloaded repositories containing malware like the Hugging Face infostealer.
references:
  - https://www.bleepingcomputer.com/news/security/fake-openai-repository-on-hugging-face-pushes-infostealer-malware/
author: Security Arsenal
date: 2024/05/23
tags:
  - attack.execution
  - attack.t1059.006
logsource:
  category: process_creation
  product: windows
detection:
  selection_img:
    Image|endswith: '\python.exe'
    CommandLine|contains: '.py'
  selection_dir:
    CurrentDirectory|contains:
      - '\Downloads\'
      - '\Desktop\'
  filter_ide_parent:
    ParentImage|contains:
      - '\Visual Studio\'
      - '\JetBrains\'
      - '\GitHub Desktop\'
  condition: selection_img and selection_dir and not filter_ide_parent
falsepositives:
  - Developers testing legitimate scripts from Downloads (requires tuning)
level: medium
---
title: Infostealer Browser Data Access
id: 1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d
status: experimental
description: Detects non-browser processes accessing browser database files (Cookies, Login Data), a key TTP of infostealers distributed via fake repositories.
references:
  - https://attack.mitre.org/techniques/T1005/
author: Security Arsenal
date: 2024/05/23
tags:
  - attack.collection
  - attack.t1005
logsource:
  category: file_access
  product: windows
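detection:
  selection_chrome:
    TargetFilename|contains:
      - '\Google\Chrome\User Data\Default\Cookies'
      - '\Google\Chrome\User Data\Default\Network\Cookies' # location used by newer Chrome builds
      - '\Google\Chrome\User Data\Default\Login Data'
  # Firefox profile directory names vary, so match the prefix and the file name separately
  selection_firefox_path:
    TargetFilename|contains: '\Mozilla\Firefox\Profiles\'
  selection_firefox_file:
    TargetFilename|contains:
      - '\cookies.sqlite'
      - '\logins.'
  filter_browsers:
    Image|endswith:
      - '\chrome.exe'
      - '\msedge.exe'
      - '\firefox.exe'
  condition: (selection_chrome or (selection_firefox_path and selection_firefox_file)) and not filter_browsers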
falsepositives:
  - Legitimate password managers or backup utilities
level: high
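
When tuning the browser data access rule, it can help to replay its intended logic against sample events before deploying it. The following is a minimal Python sketch of that logic, not a Sigma backend. The function name `is_suspicious_access` is hypothetical, the event is assumed to be a dictionary with `Image` and `TargetFilename` keys, and the Firefox check matches the `Profiles` prefix and the file name separately because profile directory names vary.

```python
# Lower-cased fragments mirroring the rule's selections and filter.
BROWSER_IMAGES = ("\\chrome.exe", "\\msedge.exe", "\\firefox.exe")
CHROME_STORES = (
    "\\google\\chrome\\user data\\default\\cookies",
    "\\google\\chrome\\user data\\default\\login data",
)
FIREFOX_PROFILE_DIR = "\\mozilla\\firefox\\profiles\\"
FIREFOX_STORES = ("\\cookies.sqlite", "\\logins.")


def is_suspicious_access(event):
    """True when a non-browser image touches a browser credential store."""
    target = event.get("TargetFilename", "").lower()
    image = event.get("Image", "").lower()
    chrome_hit = any(store in target for store in CHROME_STORES)
    firefox_hit = FIREFOX_PROFILE_DIR in target and any(
        store in target for store in FIREFOX_STORES
    )
    from_browser = image.endswith(BROWSER_IMAGES)
    return (chrome_hit or firefox_hit) and not from_browser
```

Feeding known-good events from password managers or backup tools through the same function is a quick way to estimate the false-positive volume the rule would generate.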


**KQL (Microsoft Sentinel / Defender)**

Hunt for script interpreters launched from common download locations that also make outbound network connections. File names often used in these campaigns (e.g., privacy-filter.py, setup.py in Downloads) can be added as extra command-line filters.

KQL — Microsoft Sentinel / Defender

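// Hunt for script interpreters run against Downloads/Desktop paths, correlated with network activity
// Note: scripts launched by relative path (e.g. "python main.py") will not show the folder in the command line
let TimeFrame = 1d;
DeviceProcessEvents
| where Timestamp > ago(TimeFrame)
| where FileName in~ ("python.exe", "pythonw.exe", "cmd.exe", "powershell.exe", "pwsh.exe")
| where FolderPath contains @"\Downloads\" or FolderPath contains @"\Desktop\"
    or ProcessCommandLine contains @"\Downloads\" or ProcessCommandLine contains @"\Desktop\"
| project Timestamp, DeviceId, DeviceName, AccountName, FolderPath, ProcessCommandLine, InitiatingProcessFileName, ProcessId
| join kind=inner (
    DeviceNetworkEvents
    | where Timestamp > ago(TimeFrame)
    // Broad on purpose: narrow to known C2 indicators once they are available
    | where RemoteUrl contains "huggingface.co" or RemotePort < 1024
    | project DeviceId, ProcessId = InitiatingProcessId, NetworkTime = Timestamp, RemoteUrl, RemoteIP, RemotePort
) on DeviceId, ProcessId // same device and same process, rather than identical timestamps
| where NetworkTime >= Timestamp
| project Timestamp, NetworkTime, DeviceName, AccountName, ProcessCommandLine, RemoteUrl, RemoteIP, RemotePort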
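
Process and network telemetry rarely share an identical timestamp, so correlation usually keys on the device and the process ID and then applies a time window. The Python sketch below illustrates that idea on exported events. The field names are borrowed from the Defender tables, while the function name `correlate` and the 10-minute default window are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def correlate(process_events, network_events, window=timedelta(minutes=10)):
    """Pair process events with network events from the same device and PID
    that occur within `window` after the process started."""
    by_key = {}
    for net in network_events:
        key = (net["DeviceName"], net["InitiatingProcessId"])
        by_key.setdefault(key, []).append(net)

    matches = []
    for proc in process_events:
        key = (proc["DeviceName"], proc["ProcessId"])
        for net in by_key.get(key, []):
            delta = net["Timestamp"] - proc["Timestamp"]
            # Keep only connections made shortly after process start.
            if timedelta(0) <= delta <= window:
                matches.append((proc["ProcessCommandLine"], net["RemoteUrl"]))
    return matches
```

The time window also limits false matches caused by process ID reuse on long-running hosts.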


**Velociraptor VQL**

Hunt endpoints for the presence of the malicious artifact or its execution remnants. The first query looks for recently modified Python scripts in user download directories. The second lists running non-browser processes as candidates for follow-up, since confirming access to browser SQLite databases requires open-handle data.

VQL — Velociraptor

-- Hunt for suspicious Python scripts in Downloads
SELECT FullPath, Mtime, Atime, Size, Mode
FROM glob(globs="*/Users/*/Downloads/*.py")
WHERE Mtime > timestamp(epoch=now() - 604800) -- modified within the last 7 days
   AND Size < 5000000 -- generally small scripts

-- List running non-browser processes as candidates for review
SELECT Pid, Name, CommandLine, Exe
FROM pslist()
WHERE NOT Exe =~ "(chrome|firefox|msedge|svchost)[.]exe$"
-- Correlating with open files would require specific artifacts;
-- this is only a heuristic for the running process context.


**Remediation Script (PowerShell)**

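Use this script to scan user profiles for common indicators of this compromise (Python scripts in Downloads with recent modification times) and to list running script interpreters for manual review. The script does not terminate anything on its own; confirming which process holds handles to browser data requires a tool such as Sysinternals handle.exe.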

PowerShell

# Remediation Script for Hugging Face Infostealer Indicators

# 1. Identify recent Python scripts in Downloads (Potential Droppers)
$SuspiciousFiles = Get-ChildItem -Path "C:\Users\*\Downloads" -Filter "*.py" -Recurse -ErrorAction SilentlyContinue |
    Where-Object { $_.LastWriteTime -gt (Get-Date).AddDays(-7) }

if ($SuspiciousFiles) {
    Write-Host "[!] Found suspicious Python scripts in Downloads:" -ForegroundColor Red
    $SuspiciousFiles | Format-Table FullName, LastWriteTime
    # Optional: Move to quarantine or delete (Uncomment to enforce)
    # $SuspiciousFiles | Remove-Item -Force
} else {
    Write-Host "[-] No recent Python scripts found in standard Downloads folders." -ForegroundColor Green
}

# 2. List running script interpreters and system binaries that infostealer droppers commonly abuse
# Note: Requires Administrative privileges to see processes and command lines of other users
# Advanced logic would use handle.exe (Sysinternals) to check for specific file handles
# on the Chrome/Firefox databases. Below is a basic name-based heuristic.
$ReviewNames = @("python.exe", "pythonw.exe", "cmd.exe", "powershell.exe", "rundll32.exe")
$SuspiciousProcs = Get-CimInstance Win32_Process | Where-Object { $ReviewNames -contains $_.Name }

foreach ($Proc in $SuspiciousProcs) {
    Write-Host "[+] Reviewing non-browser process: $($Proc.Name) (PID: $($Proc.ProcessId)) $($Proc.CommandLine)" -ForegroundColor Yellow
}

Write-Host "[*] Review identified processes and terminate if confirmed malicious."

Remediation

Identify and Isolate Affected Hosts: Use the detection rules above to identify machines that executed scripts from the Downloads directory or showed infostealer file access patterns. Isolate these endpoints from the network immediately to prevent further exfiltration.
Block Malicious Repositories: If your organization uses a web proxy or secure web gateway (SWG), block the specific Hugging Face repository URL identified in the threat intelligence, and add any published SHA-256 file hashes to endpoint blocklists. More broadly, consider implementing policies that require code review for any artifacts downloaded from public ML repositories before execution.
Credential Reset: Because the payload is an infostealer, assume that credentials (browser sessions, saved passwords, and potentially SSH keys) have been compromised. Force a password reset for all accounts accessed on the affected machines and invalidate active session tokens.
Remove Artifacts: Delete the cloned repository and any downloaded ZIP files from user workstations. Check for persistence mechanisms (e.g., Scheduled Tasks or Registry Run keys) that the malware may have established.
Policy Enforcement: Update acceptable use policies to prohibit the direct execution of code from "Downloads" or unverified sources. Encourage the use of dedicated, sandboxed environments for testing open-source code.
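
To scope the credential reset step, it can help to list which browser credential stores actually exist under an affected user profile. The sketch below assumes default Chrome, Edge, and Firefox install locations relative to the profile directory. The function name `credential_stores` is hypothetical, and custom profile locations or other browsers are not covered.

```python
from pathlib import Path

# Default per-user store locations, relative to the user profile directory.
# Assumes standard installs; relocated profiles will be missed.
STORE_GLOBS = {
    "Chrome": "AppData/Local/Google/Chrome/User Data/*/Login Data",
    "Edge": "AppData/Local/Microsoft/Edge/User Data/*/Login Data",
    "Firefox": "AppData/Roaming/Mozilla/Firefox/Profiles/*/logins.json",
}


def credential_stores(profile_dir):
    """Return {browser: [paths]} for credential stores found under a profile."""
    profile = Path(profile_dir)
    found = {}
    for browser, pattern in STORE_GLOBS.items():
        paths = sorted(str(p) for p in profile.glob(pattern) if p.is_file())
        if paths:
            found[browser] = paths
    return found
```

A non-empty result for a browser means saved passwords and sessions in that browser should be treated as exposed and rotated.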
