# The 5 AI Coding Security Risks Every Engineering Team Faces in 2026
The five most common ways AI coding tools leak proprietary code, with real examples and how to prevent each one.
## Why This Matters Now
AI coding tools are no longer optional for competitive engineering teams. Cursor, Claude Code, GitHub Copilot, and similar tools are installed on the majority of developer machines at companies that move fast. But most teams deployed these tools before security teams understood what they actually transmit.
The answer is: more than most developers realize.
Here are the five most common IP exposure vectors, with real-world examples and exactly how to prevent each one.
---
## Risk 1: Function Names Reveal Business Logic
The most obvious risk is also the most underestimated. When you send code to an LLM for review or refactoring, every function name is transmitted verbatim to a third-party server.
Consider what a function name like `getUserPaymentTier` reveals:
- You have a tiered pricing model
- Payment tiers are tied to user accounts (not company-level)
- There is a programmatic lookup function (meaning it is called frequently, meaning pricing logic is automated)
A competitor who sees this function name in a training corpus or API log learns your product architecture without ever reading your codebase.
**Real example pattern:**
```typescript
// What gets sent to the LLM API (and potentially logged/trained on):
async function getUserPaymentTier(userId: string): Promise<PricingTier> {
  const sub = await stripe.subscriptions.list({ customer: userId });
  return classifyTier(sub.data[0]?.plan?.interval, sub.data[0]?.quantity);
}

function classifyTier(interval: string, seats: number): PricingTier {
  if (interval === 'year' && seats > 50) return 'enterprise';
  if (seats > 10) return 'business';
  return 'starter';
}
```
From this single function, a competitor learns: your three pricing tiers (starter, business, enterprise), your enterprise threshold (50 seats), and that you use annual billing as an enterprise qualifier.
**How Pretense prevents this:**
Pretense mutates identifier names before they leave your machine. The LLM receives:
```typescript
async function _fn4a2b(_v7c1d: string): Promise<_cls9e3f> {
  const _v2b8a = await stripe.subscriptions.list({ customer: _v7c1d });
  return _fn6d4e(_v2b8a.data[0]?.plan?.interval, _v2b8a.data[0]?.quantity);
}

function _fn6d4e(_v1a9c: string, _v5f3b: number): _cls9e3f {
  if (_v1a9c === 'year' && _v5f3b > 50) return 'enterprise';
  if (_v5f3b > 10) return 'business';
  return 'starter';
}
```
The logic is fully preserved. The LLM can still review, refactor, and complete the code correctly. But no proprietary business logic is revealed through naming.
---
## Risk 2: Variable Names Expose Architecture
Variable names are the second most dangerous category. They reveal infrastructure choices, vendor relationships, and internal system topology.
```typescript
// These variable names reveal: cloud provider, storage vendor,
// internal service topology, and authentication approach
const awsS3BucketName = process.env.STORAGE_BUCKET;
const internalApiEndpoint = 'https://internal.api.company.com';
const redisSessionStore = new RedisClient({ url: process.env.REDIS_URL });
const datadogTracer = require('dd-trace').init({ service: 'payments-api' });
```

What a threat actor or competitor learns from this context window:
- You use AWS S3 for storage
- You have an internal API at a specific subdomain pattern
- You use Redis for sessions
- You use Datadog for observability, with a service named 'payments-api'
This is infrastructure reconnaissance delivered via AI API logs.
**How Pretense prevents this:**
Variable names are mutated with the same deterministic algorithm as function names. `awsS3BucketName` becomes `_v3c7a`. The infrastructure details are never transmitted.
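The article does not publish Pretense's actual algorithm, so here is a hypothetical sketch of how a deterministic, salted rename could work. The function names (`mutateIdentifier`, `rename`), the alias format, and the salted-SHA-256 approach are all illustrative assumptions, not Pretense's implementation:

```typescript
import { createHash } from 'crypto';

// Hypothetical deterministic rename: a salted hash makes the mapping
// stable (same name always yields the same alias) while the real name
// cannot be recovered without the project salt.
function mutateIdentifier(name: string, salt: string, kind: 'fn' | 'v' | 'cls'): string {
  const digest = createHash('sha256').update(salt + name).digest('hex');
  return `_${kind}${digest.slice(0, 4)}`;
}

// The mapping lives only on the developer's machine, so LLM responses
// can be translated back to the original identifiers.
const forward = new Map<string, string>();
const reverse = new Map<string, string>();

function rename(name: string, salt: string, kind: 'fn' | 'v' | 'cls'): string {
  if (!forward.has(name)) {
    const alias = mutateIdentifier(name, salt, kind);
    forward.set(name, alias);
    reverse.set(alias, name);
  }
  return forward.get(name)!;
}
```

Because the hash is salted per project, multi-turn conversations stay consistent (the same identifier always maps to the same alias), yet the alias reveals nothing about the original name.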
---
## Risk 3: Comments Become Training Data
Code comments are the highest-risk single artifact in any codebase. Developers write things in comments they would never put in documentation.
```typescript
// TODO: fix HIPAA violation in this flow — patient IDs being logged to stdout
// HACK: skipping auth check here because legacy API doesn't support tokens
// NOTE: this algorithm is patented (US 10,234,567) — do not publish
// WARNING: Acme Corp data must never touch this endpoint per contract clause 4.2
```

Every one of these comments represents:
- A live compliance violation (HIPAA)
- A known security bypass in production
- Patented IP with a public identifier
- A contractual obligation that, if violated, creates legal liability
When these comments are sent to an LLM API, they are transmitted to infrastructure you do not control. Training opt-out provisions help, but they are contractual promises, not technical guarantees.
**How Pretense prevents this:**
Pretense's secret scanner pattern-matches comments against a set of high-risk patterns: PHI references, explicit security bypass markers, patent references, and customer-identifying terms. Flagged comments are either blocked or stripped before transmission. The developer is notified with the reason.
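As an illustration of the approach (the rule names and regexes below are hypothetical, not Pretense's actual rule set), a comment scanner of this kind can be a small list of labeled patterns:

```typescript
// Hypothetical high-risk comment patterns, one labeled regex per category.
const HIGH_RISK_PATTERNS: Array<{ reason: string; pattern: RegExp }> = [
  { reason: 'PHI / compliance reference', pattern: /\b(HIPAA|PHI|patient)\b/i },
  { reason: 'security bypass marker', pattern: /\bHACK\b|skipping auth|\bbypass\b/i },
  { reason: 'patent reference', pattern: /\bpatent(ed)?\b|US\s?\d{1,2},\d{3},\d{3}/i },
  { reason: 'customer/contract reference', pattern: /\bper contract\b|clause\s\d+\.\d+/i },
];

// Returns the reasons a comment was flagged (empty array = safe to send).
function scanComment(comment: string): string[] {
  return HIGH_RISK_PATTERNS
    .filter(({ pattern }) => pattern.test(comment))
    .map(({ reason }) => reason);
}
```

A flagged comment is then blocked or stripped before the request leaves the machine, and the matched reason is what the developer sees in the notification.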
---
## Risk 4: Secrets in Context Windows
This risk is more subtle than pasting an API key into a prompt. It is about what the LLM infers from surrounding code.
You probably do not paste your `.env` file. But you send code like this:
```typescript
const openaiClient = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  organization: process.env.OPENAI_ORG_ID,
});

const anthropicClient = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const stripeClient = Stripe(process.env.STRIPE_SECRET_KEY);
```
The LLM now knows you have API keys for OpenAI, Anthropic, and Stripe in your environment. This is vendor intelligence. More dangerously, if a developer has previously sent a partially populated `.env` to a different AI session, and the LLM provider correlates sessions (even anonymized), inference attacks become possible.
The direct risk is simpler: developers regularly autocomplete environment variable names in test files, and those completions reflect real key names from their environment.
```typescript
// What a developer types:
const testApiKey = process.env.

// What autocomplete suggests (based on context from prior sessions):
const testApiKey = process.env.STRIPE_SECRET_KEY_PROD;
```
**How Pretense prevents this:**
Pretense's secret scanner detects over 30 patterns of credentials, tokens, and keys. Any secret matching a known pattern is blocked before the context window is transmitted. Environment variable references are flagged for review.
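Here is a minimal sketch of what signature-based secret detection looks like. The two patterns follow publicly documented key formats (Stripe `sk_live_` prefixes, AWS `AKIA` access key IDs), but the function and the pattern list are illustrative, not Pretense's actual scanner:

```typescript
// Two well-known credential formats; a real scanner carries many more.
const SECRET_SIGNATURES: Array<{ name: string; pattern: RegExp }> = [
  { name: 'Stripe live secret key', pattern: /sk_live_[0-9a-zA-Z]{10,}/ },
  { name: 'AWS access key ID', pattern: /\bAKIA[0-9A-Z]{16}\b/ },
];

// Returns the names of any credential types found in the context window.
function findSecrets(context: string): string[] {
  return SECRET_SIGNATURES
    .filter(({ pattern }) => pattern.test(context))
    .map(({ name }) => name);
}
```

Note that a `process.env.STRIPE_SECRET_KEY` *reference* matches nothing here; only an actual key value does. That is why environment variable references are flagged for review rather than hard-blocked.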
---
## Risk 5: Refactoring Requests Expose Complete Internal APIs
The highest-volume risk for teams actively using AI tools is refactoring. When a developer asks Claude or Copilot to refactor a service, they typically provide the full service file as context.
A complete service file contains:
- Every method signature (the full internal API surface)
- Every dependency injection point (the service graph)
- Every database query pattern (the data model)
- Every error handling path (the failure modes)
```typescript
// A typical refactoring context window reveals:
class PaymentOrchestrationService {
  constructor(
    private readonly paymentGateway: StripeGateway,
    private readonly fraudDetection: AcmeFraudAPI,       // vendor name
    private readonly settlementQueue: BullQueue,
    private readonly auditLogger: ComplianceAuditLogger,
    private readonly notificationService: TwilioNotifier // vendor name
  ) {}

  async processPayment(request: PaymentRequest): Promise<PaymentResult> { ... }
  async retryFailedPayment(paymentId: string): Promise<void> { ... }
  async reconcileSettlement(batchId: string): Promise<ReconciliationReport> { ... }
  async generateAuditReport(startDate: Date, endDate: Date): Promise<AuditLog> { ... }
}
```
This is your complete payment orchestration architecture, vendor relationships, and API contract, delivered to a third-party server in a single refactoring request.
**How Pretense prevents this:**
Class names, constructor parameter names, method names, and injected service names are all mutated. The LLM sees the structural pattern and can refactor it correctly. It never sees `AcmeFraudAPI`, `ComplianceAuditLogger`, or `TwilioNotifier`.
---
## The Bottom Line
None of these risks require a breach. They require only that AI provider infrastructure is accessible to parties you have not authorized, which is always true by definition when you send API requests to a third party.
The five risks above are not theoretical. They are the normal usage patterns of every engineering team that uses AI coding tools.
Pretense addresses all five with a single proxy layer that adds less than 2ms of latency to every request.
[Scan your codebase free at pretense.ai/early-access](/early-access)