Introduction
Tempus AI, a publicly traded healthcare artificial intelligence company specializing in precision medicine and genomic analysis, is currently facing multiple class-action lawsuits alleging unauthorized collection and disclosure of patients' genetic data. The lawsuits claim that Tempus collected genomic data from patients without obtaining proper consent and subsequently shared this highly sensitive information with third parties, including Google, for data training and analysis purposes.
For healthcare defenders, these lawsuits are a wake-up call about the risks of third-party AI platforms processing Protected Health Information (PHI) and genetic data. Unlike traditional ransomware or malware attacks, the alleged conduct involves business logic abuse and improper data handling practices that can fly under the radar of conventional security controls. Genetic data is uniquely sensitive: it is immutable, identifiable, and carries implications not just for the patient but for their blood relatives. Organizations should audit their AI vendors' data handling practices now and implement monitoring that detects unauthorized PHI disclosure before it becomes a headline-making breach.
Technical Analysis
Affected Products and Platforms
- Primary Platform: Tempus AI genomic analysis and precision medicine platform
- Data Types: Genetic sequences (DNA/RNA), genomic variants, diagnostic reports, patient identifiers
- Third-Party Destinations: Google Cloud Platform (for data processing/training), potentially other AI/ML services
Attack Vector and Mechanism
This is not a technical vulnerability (CVE) or malware-based attack. Instead, the alleged unauthorized disclosure occurs through:
- Business Logic Abuse: The platform's data sharing functionality may have transmitted genomic data to third-party services beyond what patient consent agreements permitted
- Improper Data Classification: Genetic data may not have been properly marked as restricted PHI, allowing automated data pipelines to include it in third-party sharing operations
- API Data Leakage: Application programming interfaces (APIs) connecting to third-party services (Google Cloud, AI training endpoints) may have transmitted full genomic datasets rather than anonymized aggregates
- Consent Management Gaps: The platform's consent tracking system allegedly failed to enforce patient preferences regarding genetic data sharing
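One way to picture the consent and API-leakage gaps listed above is a guard placed in front of every outbound third-party call. The sketch below is illustrative only: the field names, the `ConsentRecord` type, and the purpose labels are assumptions made for this example and do not describe Tempus's systems or any vendor API.

```python
from dataclasses import dataclass, field

# Hypothetical field names; a real schema will differ.
DIRECT_IDENTIFIERS = {"patient_id", "name", "dob", "mrn"}
GENOMIC_FIELDS = {"sequence", "variants", "raw_reads"}


@dataclass(frozen=True)
class ConsentRecord:
    """Purposes a patient has explicitly agreed to (hypothetical model)."""
    patient_id: str
    allowed_purposes: frozenset = field(default_factory=frozenset)


def prepare_outbound(record: dict, consent: ConsentRecord, purpose: str) -> dict:
    """Return the payload that may leave the organization for `purpose`.

    Raises PermissionError when consent for the purpose is not on file.
    Direct identifiers and genomic fields are always dropped, so only
    coarse, non-genomic attributes remain.
    """
    if record.get("patient_id") != consent.patient_id:
        raise ValueError("consent record does not belong to this patient")
    if purpose not in consent.allowed_purposes:
        raise PermissionError(f"no documented consent for purpose: {purpose}")
    blocked = DIRECT_IDENTIFIERS | GENOMIC_FIELDS
    return {key: value for key, value in record.items() if key not in blocked}
```

With a consent record that lists only `treatment`, a call for the purpose `model_training` raises `PermissionError` before any payload is built. Failing closed in this way is the behavior the lawsuits allege was missing.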
Compliance Implications
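- HIPAA Violations: Unauthorized disclosure of PHI carries civil penalties of up to $1.5 million per violation category per calendar year (the statutory cap, before annual inflation adjustments)
- GINA Violations: The Genetic Information Nondiscrimination Act (GINA) restricts how health insurers and employers may collect, use, and disclose genetic information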
- State Privacy Laws: CCPA/CPRA (California) and other state laws provide additional protections for genetic information
Exploitation Status
This is active litigation, and the allegations are set out in court filings; they have not been proven. While no CVE exists, the business process weaknesses represent a systemic risk across healthcare organizations utilizing AI/ML services. The CISA Known Exploited Vulnerabilities (KEV) catalog does not apply, but HHS OCR guidance on third-party data sharing is directly relevant.
Detection & Response
Executive Takeaways
- Conduct Immediate Vendor Data Mapping: Audit all AI/ML vendors to map exactly what data types are collected, stored, processed, and shared. Require documentation of data lineage from ingestion through any third-party transmissions.
- Implement Genetic Data Classification Tagging: Deploy automated data classification (DLP) that specifically identifies genomic sequences, variant data, and genetic reports. Tag this data with the highest restriction level and enforce policy-based controls preventing external sharing without explicit patient consent.
- Deploy Egress Monitoring for Third-Party Services: Monitor all data transfers to cloud platforms (GCP, AWS, Azure) and AI service endpoints. Implement baselines for normal data volumes and alert on anomalous large transfers of healthcare-related data.
- Establish API Governance Framework: Inventory all API endpoints connecting healthcare applications to external services. Implement API security gateways with schema validation to ensure only properly anonymized/aggregated data is transmitted to third parties.
- Enhance Consent Management Audit Trails: Deploy logging that captures every instance where patient data is accessed or transmitted, correlated with the specific consent permissions. Regular audit reports should identify any data actions that exceed consent boundaries.
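The egress-baseline takeaway can be sketched in a few lines. This is a minimal illustration, assuming daily per-destination byte totals are already available from proxy or firewall logs; the function name, the three-standard-deviation threshold, and the seven-day minimum history are assumptions chosen for the example, not tuned recommendations.

```python
from statistics import mean, pstdev


def egress_alerts(history_bytes, today_bytes, k=3.0, min_days=7):
    """Flag destinations whose outbound volume today exceeds their baseline.

    history_bytes: {destination: [daily byte totals]}
    today_bytes:   {destination: bytes sent today}
    Returns a list of (destination, bytes, reason) tuples.
    """
    alerts = []
    for dest, sent in sorted(today_bytes.items()):
        past = history_bytes.get(dest, [])
        if len(past) < min_days:
            # Too little history to baseline: a new destination is worth a look.
            alerts.append((dest, sent, "no-baseline"))
            continue
        threshold = mean(past) + k * pstdev(past)
        if sent > threshold:
            alerts.append((dest, sent, "above-baseline"))
    return alerts
```

A destination with a week of roughly 100-byte days and a 5,000-byte day today would be flagged as `above-baseline`, while a destination never seen before is reported as `no-baseline` so that a new vendor endpoint does not slip through for lack of history.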
Detection Content
---
title: Potential Genetic Data Exfiltration to Cloud Storage Services
id: 8a4f2b91-5d3e-4c7a-9f12-6b8e3c4d5a6f
status: experimental
description: Detects potential exfiltration of genomic/healthcare data to cloud storage platforms from healthcare applications.
references:
  - https://www.hipaajournal.com/tempus-ai-class-action-alawsuit-genetic-data-disclosures/
author: Security Arsenal
date: 2024/12/06
tags:
  - attack.exfiltration
  - attack.t1567
logsource:
  category: network_connection
  product: windows
detection:
  selection:
    DestinationHostname|contains:
      - '.googleapis.com'
      - '.googleusercontent.com'
      - '.amazonaws.com'
      - '.azure.com'
      - '.blob.core.windows.net'
      - 'storage.googleapis.com'
    Initiated: 'true'
  filter_legitimate:
    Image|contains:
      - '\Program Files\'
      - '\Program Files (x86)\'
      - '\Windows\System32\'
  filter_browsers:
    Image|endswith:
      - '\chrome.exe'
      - '\firefox.exe'
      - '\edge.exe'
      - '\msedge.exe'
  condition: selection and not 1 of filter*
falsepositives:
  - Authorized healthcare cloud backup operations
  - Legitimate application updates
level: high
---
title: Healthcare Application Executing Suspicious Data Transfer Commands
id: 3c7d1e95-8f4a-2b6c-1d4e-9a0f5b2c3d4e
status: experimental
description: Detects healthcare applications or services executing commands that may indicate data export or exfiltration activities.
references:
  - https://www.hipaajournal.com/tempus-ai-class-action-alawsuit-genetic-data-disclosures/
author: Security Arsenal
date: 2024/12/06
tags:
  - attack.execution
  - attack.t1059
  - attack.exfiltration
  - attack.t1041
logsource:
  category: process_creation
  product: windows
detection:
  selection_genomic:
    CommandLine|contains:
      - 'genome'
      - 'genetic'
      - 'dna_'
      - 'variant'
      - 'sequenc'
      - 'fasta'
      # Match file extensions rather than bare substrings:
      # 'sam' alone would also match 'sample', 'same', etc.
      - '.vcf'
      - '.bam'
      - '.sam'
  selection_export:
    CommandLine|contains:
      - 'copy '
      - 'xcopy '
      - 'robocopy '
      - 'upload'    # also covers 'upload_to'
      - 'gsutil '
      - 'aws s3 '
      - 'az storage'
      - 'curl '
      - 'wget '
      - 'scp '
      - 'rsync '
      - 'export'
  condition: all of selection_*
falsepositives:
  - Authorized genomic data processing workflows
  - Legitimate backup operations
level: medium
---
title: Unusual Database Query Patterns Suggesting Bulk Data Extraction
id: 7b2e4d8c-1a5f-9e3b-4c6d-8f0a2b1c3d4e
status: experimental
description: Detects unusual database query patterns that may indicate bulk extraction of sensitive healthcare or genetic data.
references:
  - https://www.hipaajournal.com/tempus-ai-class-action-alawsuit-genetic-data-disclosures/
author: Security Arsenal
date: 2024/12/06
tags:
  # T1005 (Data from Local System) and T1074 (Data Staged) are both Collection techniques
  - attack.collection
  - attack.t1005
  - attack.t1074
logsource:
  # Not a standard Sigma log source: map it to your database audit logs
  # (and the field holding the SQL text to Query) in the processing pipeline
  category: database
  product: windows
detection:
  selection_tables:
    Query|contains:
      - 'patient'
      - 'genome'
      - 'genetic'
      - 'sequence'
      - 'variant'
      - 'dna'
      - 'biomarker'
      - 'clinical'
  selection_bulk:
    Query|contains:
      - 'SELECT *'
      - 'SELECT ALL'
      - 'INTO OUTFILE'
      - 'DUMP TABLE'
      - 'BULK INSERT'
      - 'xp_cmdshell'
      - 'OPENROWSET'
  selection_aggregation:
    Query|contains:
      - 'GROUP BY'
      - 'COUNT(*)'
      - 'SUM('
  filter_normal_reporting:
    Query|contains:
      - 'WHERE create_date >'
      - 'WHERE date >'
      - 'WHERE timestamp >'
  condition: selection_tables and (selection_bulk or selection_aggregation) and not filter_normal_reporting
falsepositives:
  - Legitimate reporting queries
  - Authorized data analytics processes
level: medium
// Hunt for potential genomic/PHI data exfiltration to cloud services
// Query: Connections to cloud platforms from healthcare workstations
// Note: DeviceNetworkEvents does not expose byte-count columns, so transfer
// volume has to be checked against proxy or firewall logs instead.
let CloudDomains = dynamic(['.googleapis.com', '.googleusercontent.com', '.amazonaws.com', '.azure.com', '.blob.core.windows.net', 'storage.googleapis.com', '.dropboxapi.com', '.box.com']);
let HighRiskProcesses = dynamic(['python.exe', 'python3.exe', 'node.exe', 'java.exe', 'powershell.exe', 'pwsh.exe', 'cmd.exe', 'bash', 'curl', 'wget']);
DeviceNetworkEvents
| where RemoteUrl has_any (CloudDomains)
// Scripting/transfer tools, or any binary running outside the standard install paths
| where InitiatingProcessFileName in~ (HighRiskProcesses)
    or (InitiatingProcessFolderPath !contains @"\Program Files" and InitiatingProcessFolderPath !contains @"\Windows")
| project Timestamp, DeviceName, InitiatingProcessAccountName, InitiatingProcessFileName, InitiatingProcessCommandLine, RemoteUrl, RemotePort, LocalIP, RemoteIP
| order by Timestamp desc
// Hunt for unusual file access patterns in genomic data directories
// Query: Access to genetic/genomic file types from non-standard processes
let GenomicExtensions = dynamic(['.fasta', '.fa', '.fastq', '.fq', '.bam', '.sam', '.vcf', '.vcf.gz', '.cram', '.gff', '.gff3', '.bed', '.bedgraph', '.wig', '.bigwig', '.h5', '.hdf5']);
DeviceFileEvents
| where FileName has_any (GenomicExtensions)
| where ActionType in ('FileCreated', 'FileAccessed', 'FileModified')
| where InitiatingProcessFileName !in~ ('explorer.exe', 'code.exe', 'notepad++.exe', 'notepad.exe', 'wordpad.exe')
| where InitiatingProcessFolderPath !contains @"\Program Files" and InitiatingProcessFolderPath !contains @"\Windows"
| project Timestamp, DeviceName, FileName, FolderPath, InitiatingProcessFileName, InitiatingProcessCommandLine, InitiatingProcessAccountName, ActionType
| order by Timestamp desc
-- Hunt for processes accessing genomic data files and potentially exfiltrating
-- This artifact targets healthcare endpoints where genetic data may be processed
-- Assumes a Windows endpoint: handles() is a Windows-only plugin, and column
-- names should be checked against the deployed Velociraptor version
LET GenomicRegex = '(?i)\\.(fasta|fa|fastq|fq|bam|sam|vcf|vcf\\.gz|cram|gff|gff3|bed|bedgraph|wig|bigwig|h5|hdf5)$'

-- Find processes with open handles to genomic files
SELECT Pid, ProcessName, CommandLine, Username, ProcessPath,
       ProcessCreateTime, count() AS FileHandleCount
FROM foreach(
    row={
        SELECT Pid, Name AS ProcessName, CommandLine, Username,
               Exe AS ProcessPath, CreateTime AS ProcessCreateTime
        FROM pslist()
    },
    query={
        -- Name here is the handle's target path, not the process name
        SELECT Pid, ProcessName, CommandLine, Username, ProcessPath, ProcessCreateTime
        FROM handles(pid=Pid, types="File")
        WHERE Name =~ GenomicRegex
    })
GROUP BY Pid
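-- Identify network connections from processes accessing genomic data
-- netstat() lists all sockets, so rows are matched to each process by Pid;
-- field names (Raddr, Status) should be checked against the deployed version
SELECT ProcPid AS Pid, ProcessName, Username,
       Raddr.IP AS RemoteAddress, Raddr.Port AS RemotePort,
       Family, Status AS State
FROM foreach(
    row={
        SELECT Pid AS ProcPid, Name AS ProcessName, Username
        FROM pslist()
        WHERE Name =~ '^(python3?|node|java|curl|wget|rsync|scp)(\\.exe)?$'
           OR CommandLine =~ '(genome|genetic|dna|variant|sequence)'
    },
    query={
        SELECT ProcPid, ProcessName, Username, Raddr, Family, Status
        FROM netstat()
        WHERE Pid = ProcPid
          AND Status =~ 'ESTAB'
          AND (Raddr.Port IN (443, 80, 22)
               OR NOT Raddr.IP =~ '^(10\\.|172\\.(1[6-9]|2[0-9]|3[0-1])\\.|192\\.168\\.)')
    })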
# PowerShell Script: Healthcare AI Vendor Data Handling Audit
# This script helps audit data handling practices for healthcare AI vendors
# Usage: .\Audit-HealthcareAIVendorDataHandling.ps1 -VendorName "Tempus" -DataPath "C:\HealthcareData"
param(
    [Parameter(Mandatory=$true)]
    [string]$VendorName,
    [Parameter(Mandatory=$true)]
    [string]$DataPath,
    [string]$OutputPath = ".\VendorAudit-$(Get-Date -Format 'yyyyMMdd').csv"
)

# Define genomic/PHI file patterns
$GenomicExtensions = @('*.fasta', '*.fa', '*.fastq', '*.fq', '*.bam', '*.sam', '*.vcf', '*.vcf.gz', '*.cram', '*.gff', '*.gff3', '*.bed', '*.h5', '*.hdf5')
$PHIPatterns = @('*patient*', '*medical*', '*clinical*', '*diagnosis*', '*treatment*', '*medication*', '*labresult*', 'phi*', 'protected*')

# Initialize results array
$AuditResults = @()
Write-Host "Starting Healthcare AI Vendor Data Handling Audit..." -ForegroundColor Cyan
Write-Host "Vendor: $VendorName" -ForegroundColor Yellow
Write-Host "Data Path: $DataPath" -ForegroundColor Yellow

# Check if data path exists
if (-not (Test-Path -Path $DataPath)) {
    Write-Error "Data path does not exist: $DataPath"
    exit 1
}
# 1. Scan for genomic data files
Write-Host "" -ForegroundColor Cyan
Write-Host "[1] Scanning for genomic data files..." -ForegroundColor Cyan
foreach ($ext in $GenomicExtensions) {
    $files = Get-ChildItem -Path $DataPath -Filter $ext -Recurse -ErrorAction SilentlyContinue
    foreach ($file in $files) {
        $fileHash = (Get-FileHash -Path $file.FullName -Algorithm SHA256 -ErrorAction SilentlyContinue).Hash
        $AuditResults += [PSCustomObject]@{
            Timestamp    = Get-Date -Format 'yyyy-MM-dd HH:mm:ss'
            Vendor       = $VendorName
            CheckType    = 'GenomicDataFile'
            Finding      = $file.FullName
            FileSize     = $file.Length
            Hash         = $fileHash
            LastModified = $file.LastWriteTime
            Status       = 'Found'
        }
    }
}

# 2. Scan for potential PHI files
Write-Host "" -ForegroundColor Cyan
Write-Host "[2] Scanning for potential PHI files..." -ForegroundColor Cyan
foreach ($pattern in $PHIPatterns) {
    $files = Get-ChildItem -Path $DataPath -Filter $pattern -Recurse -ErrorAction SilentlyContinue
    foreach ($file in $files) {
        $AuditResults += [PSCustomObject]@{
            Timestamp    = Get-Date -Format 'yyyy-MM-dd HH:mm:ss'
            Vendor       = $VendorName
            CheckType    = 'PotentialPHIFile'
            Finding      = $file.FullName
            FileSize     = $file.Length
            Hash         = (Get-FileHash -Path $file.FullName -Algorithm SHA256 -ErrorAction SilentlyContinue).Hash
            LastModified = $file.LastWriteTime
            Status       = 'ReviewRequired'
        }
    }
}
# 3. Check for network connections from vendor software
Write-Host "" -ForegroundColor Cyan
Write-Host "[3] Checking for active network connections from vendor processes..." -ForegroundColor Cyan
$VendorProcesses = Get-Process | Where-Object {
    $_.ProcessName -like "*$VendorName*" -or
    $_.MainWindowTitle -like "*$VendorName*" -or
    $_.Path -like "*$VendorName*"
}
foreach ($proc in $VendorProcesses) {
    try {
        $connections = Get-NetTCPConnection -OwningProcess $proc.Id -ErrorAction SilentlyContinue
        foreach ($conn in $connections) {
            if ($conn.State -eq 'Established') {
                # Resolve the remote host name; fall back to the raw IP if reverse DNS fails
                $remoteEndpoint = try { [System.Net.Dns]::GetHostEntry($conn.RemoteAddress).HostName } catch { $conn.RemoteAddress }
                $AuditResults += [PSCustomObject]@{
                    Timestamp    = Get-Date -Format 'yyyy-MM-dd HH:mm:ss'
                    Vendor       = $VendorName
                    CheckType    = 'NetworkConnection'
                    Finding      = "$($proc.ProcessName) (PID: $($proc.Id)) connected to $($remoteEndpoint):$($conn.RemotePort)"
                    FileSize     = 'N/A'
                    Hash         = 'N/A'
                    LastModified = 'N/A'
                    # RFC 1918 private ranges: 10/8, 172.16/12, 192.168/16
                    Status       = if ($conn.RemoteAddress -match '^(10\.|172\.(1[6-9]|2[0-9]|3[01])\.|192\.168\.)') { 'Internal' } else { 'External' }
                }
            }
        }
    } catch {
        # Access denied or no connections
    }
}
# 4. Check for data export related Scheduled Tasks
Write-Host "" -ForegroundColor Cyan
Write-Host "[4] Checking for data export scheduled tasks..." -ForegroundColor Cyan
$ExportTasks = Get-ScheduledTask | Where-Object {
    $_.TaskName -like "*export*" -or
    $_.TaskName -like "*upload*" -or
    $_.TaskName -like "*sync*" -or
    $_.TaskName -like "*$VendorName*"
}
foreach ($task in $ExportTasks) {
    $taskInfo = $task | Get-ScheduledTaskInfo
    $AuditResults += [PSCustomObject]@{
        Timestamp    = Get-Date -Format 'yyyy-MM-dd HH:mm:ss'
        Vendor       = $VendorName
        CheckType    = 'ScheduledTask'
        Finding      = "Task: $($task.TaskName), Command: $($task.Actions.Execute), Args: $($task.Actions.Arguments)"
        FileSize     = 'N/A'
        Hash         = 'N/A'
        LastModified = $taskInfo.LastRunTime
        Status       = 'ReviewRequired'
    }
}
# 5. Check for recent data exports via Windows Event Logs
Write-Host "" -ForegroundColor Cyan
Write-Host "[5] Checking event logs for potential data export activities..." -ForegroundColor Cyan
# The access names are grouped so that all three alternatives are anchored to 'Accesses:'
$ExportEvents = Get-WinEvent -FilterHashtable @{LogName='Security'; ID=4663; StartTime=(Get-Date).AddDays(-7)} -ErrorAction SilentlyContinue |
    Where-Object { $_.Message -match 'Object Type:(.*File)' -and ($_.Message -match 'Accesses:.*(WriteData|AppendData|Delete)') }
# $evt is used because $event is a PowerShell automatic variable
foreach ($evt in $ExportEvents) {
    if ($evt.Message -match 'File|Path') {
        $AuditResults += [PSCustomObject]@{
            Timestamp    = $evt.TimeCreated
            Vendor       = $VendorName
            CheckType    = 'FileAccessEvent'
            Finding      = $evt.Message
            FileSize     = 'N/A'
            Hash         = 'N/A'
            LastModified = $evt.TimeCreated
            Status       = 'ReviewRequired'
        }
    }
}
# Export results
$AuditResults | Export-Csv -Path $OutputPath -NoTypeInformation
Write-Host "" -ForegroundColor Cyan
Write-Host "Audit complete. Results saved to: $OutputPath" -ForegroundColor Green
Write-Host "Total findings: $($AuditResults.Count)" -ForegroundColor Yellow
Write-Host "" -ForegroundColor Cyan
Remediation
Immediate Actions for Healthcare Organizations
- Vendor Assessment and Contract Review
  - Action: Immediately review all BAAs (Business Associate Agreements) and contracts with AI/ML vendors processing PHI or genetic data
  - Specifics: Verify contracts explicitly prohibit sharing patient genetic data with third parties for model training without separate, documented consent
  - Deadline: Complete within 30 days
- Data Inventory and Classification
  - Action: Conduct a comprehensive inventory of all genomic/genetic data stored or processed by third-party AI platforms
  - Specifics: Catalog data sources, data types, storage locations, access permissions, and any approved third-party sharing arrangements
  - Reference: NIST SP 800-60 Rev. 1, Volume 2 - Mapping Types of Information and Information Systems to Security Categories
- Consent Management Verification
  - Action: Audit patient consent forms and consent tracking systems to verify genetic data sharing permissions
  - Specifics: Implement automated checks that block data sharing to third parties unless explicit consent for that purpose is documented
  - Tool Category: Consent Management Platforms (CMP)
- Egress Monitoring Implementation
  - Action: Deploy data loss prevention (DLP) and egress monitoring on all systems handling genetic data
  - Specifics: Implement blocking rules for unauthorized transfers to cloud platforms, AI service endpoints, and third-party APIs
  - Configuration Example:
    Block outbound transfers to *.googleapis.com from healthcare applications unless:
    - Transmission is encrypted (TLS 1.2+)
    - Data is anonymized/aggregated (no direct identifiers)
    - Patient consent is verified in consent management system
- API Security Controls
  - Action: Implement API security gateways for all healthcare applications connecting to external services
  - Specifics: Enable schema validation, payload inspection, and rate limiting to ensure only authorized data types are transmitted
  - Vendor Solutions: Apigee, Kong API Gateway, AWS API Gateway with AWS WAF
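The egress rule in the configuration example above can be expressed as a small policy function. This is a sketch under stated assumptions rather than a gateway configuration: `TransferRequest`, the identifier field names, and the idea that the gateway already knows the TLS version and consent status of each request are all hypothetical. Checking field names is also only a proxy for anonymization; it does not prove a payload is de-identified.

```python
from dataclasses import dataclass

# Assumed names for illustration; adapt to the real gateway and schema.
GOVERNED_SUFFIXES = (".googleapis.com",)
DIRECT_IDENTIFIERS = frozenset({"patient_id", "name", "dob", "mrn", "address"})
TLS_RANK = {"TLSv1": 0, "TLSv1.1": 1, "TLSv1.2": 2, "TLSv1.3": 3}


@dataclass(frozen=True)
class TransferRequest:
    destination_host: str
    tls_version: str            # e.g. "TLSv1.2"
    payload_fields: frozenset   # top-level field names in the outbound payload
    consent_verified: bool      # result of a lookup in the consent system


def evaluate_transfer(req: TransferRequest) -> tuple[bool, list[str]]:
    """Return (allowed, reasons_for_denial) for one outbound transfer."""
    if not req.destination_host.lower().endswith(GOVERNED_SUFFIXES):
        return True, []  # outside this rule's scope; other rules may still apply
    reasons = []
    if TLS_RANK.get(req.tls_version, -1) < TLS_RANK["TLSv1.2"]:
        reasons.append("transport is not TLS 1.2 or later")
    leaked = sorted(req.payload_fields & DIRECT_IDENTIFIERS)
    if leaked:
        reasons.append("direct identifiers present: " + ", ".join(leaked))
    if not req.consent_verified:
        reasons.append("patient consent not verified")
    return (not reasons), reasons
```

Returning every failed condition, instead of stopping at the first, gives the audit trail a complete record of why a transfer was blocked.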
Long-Term Security Controls
- Zero Trust Architecture for Data Access
  - Implement micro-segmentation for genomic databases and processing systems
  - Require MFA for all access to genetic data, including automated service accounts
  - Enforce just-in-time (JIT) access with automatic expiration
- Privacy-Preserving Technologies
  - Evaluate implementation of federated learning for AI models (data stays on-premise, only model updates are shared)
  - Implement differential privacy techniques for any data sharing to third parties
  - Consider homomorphic encryption for processing encrypted genetic data without decryption
- Enhanced Audit Logging
  - Enable immutable audit logs for all genetic data access, modification, and transmission
  - Implement SIEM correlation rules detecting unauthorized data sharing patterns
  - Establish quarterly external reviews of data access logs
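For the audit-logging control above, one common building block is a hash chain, in which each entry commits to the one before it so that later edits become detectable. The sketch below is a minimal in-memory illustration with hypothetical function names. It is tamper-evident rather than truly immutable: a production system would still need write-once storage or an external anchor for the latest hash.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the event."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    """Append an event, linking it to the hash of the preceding entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    log.append(entry)
    return entry


def verify_chain(log: list):
    """Return the index of the first inconsistent entry, or None if intact."""
    prev_hash = GENESIS
    for index, entry in enumerate(log):
        expected = _digest(prev_hash, entry["event"])
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return index
        prev_hash = entry["hash"]
    return None
```

Editing any recorded event changes its digest, so `verify_chain` reports the position of the first altered entry, which gives the quarterly log reviews a concrete integrity check to run.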
Regulatory Compliance References
- HHS OCR Guidance on Third-Party Data Sharing: https://www.hhs.gov/hipaa/for-professionals/special-topics/third-parties/
- NIH Genomic Data Sharing (GDS) Policy: https://grants.nih.gov/grants/guide/notice-files/NOT-OD-21-011.html
- GAO Report on Protecting Health Privacy: https://www.gao.gov/products/gao-23-105860