# Mutation vs Redaction: A Technical Deep Dive into AI Privacy Techniques
Why redaction degrades LLM output quality and how deterministic mutation preserves semantic structure while hiding proprietary identifiers.
## The Problem They Both Solve
Before comparing approaches, it is worth being precise about what we are protecting against.
When a developer sends code to an LLM API, the request payload contains proprietary information: function names that encode business logic, variable names that reveal architecture, class hierarchies that expose system design. The goal of any AI privacy technique is to prevent that proprietary information from leaving the developer's network while still allowing the LLM to produce useful output.
Redaction and mutation are the two dominant approaches. They are fundamentally different in how they handle the LLM's need for semantic context.
---
## What Redaction Actually Does
Redaction removes or replaces identified sensitive content with placeholders. Most enterprise DLP (Data Loss Prevention) tools that have added AI support use this approach because it is conceptually simple and maps to existing pattern-matching infrastructure.
**Named entity recognition (NER) approach:**
```python
# Input code sent to DLP tool:
def getUserPaymentTier(userId: str) -> str:
    customer = stripe.Customer.retrieve(userId)

# After NER-based redaction:
def [FUNCTION_NAME]([PARAM_1]: str) -> str:
    [VAR_1] = stripe.Customer.retrieve([PARAM_1])
    return [FUNCTION_NAME_2]([VAR_1].[ATTR_1].[ATTR_2].[ATTR_3].[ATTR_4])
```
**Regex pattern approach:**
```python
# Regex rules for common patterns (camelCase, snake_case identifiers):
import re

def redact_identifiers(code: str) -> str:
    # Match function definitions
    code = re.sub(r'def ([a-z_][a-zA-Z0-9_]+)', r'def [FUNC]', code)
    # Match variable assignments
    code = re.sub(r'([a-z_][a-zA-Z0-9_]+)\s*=', r'[VAR] =', code)
    return code
```

Both approaches share a fatal flaw: they destroy the structural information the LLM needs to produce useful output.
## The Limitations of Redaction
**1. Breaks LLM context severely**
LLMs do not understand code by parsing ASTs. They understand code through patterns learned during training: "a function called getUserX probably returns data about a user", "a variable called customerRecords is likely a list of customer objects", "a class called PaymentProcessor probably has methods for authorizing and capturing charges."
When you replace all identifiers with `[FUNC_1]`, `[VAR_2]`, and `[CLASS_3]`, you destroy the semantic signals the LLM uses to generate high-quality completions and refactors. The model is now reasoning about anonymous placeholders rather than meaningful domain concepts.
**2. Cross-reference resolution fails**
If `getUserPaymentTier` is called in three different files and redacted as `[FUNC_1]` in one file and `[FUNC_7]` in another, the LLM cannot understand that these references point to the same function. Cross-file refactoring becomes impossible.
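The failure mode is easy to reproduce. Below is a minimal sketch (hypothetical code, not any particular DLP product) in which placeholder numbering restarts per file, so the same function receives different placeholders in different files:

```python
import re

def redact_file(code: str) -> str:
    """Stateless per-file redaction: placeholder numbering restarts for every file."""
    seen: dict[str, str] = {}

    def repl(m: re.Match) -> str:
        name = m.group(1)
        # Assign the next free placeholder number within THIS file only
        seen.setdefault(name, f"[FUNC_{len(seen) + 1}]")
        return seen[name] + "("

    return re.sub(r"\b([a-zA-Z_][a-zA-Z0-9_]*)\(", repl, code)

file_a = redact_file("getUserPaymentTier(uid)")
file_b = redact_file("audit(x); getUserPaymentTier(x)")
print(file_a)  # [FUNC_1](uid)
print(file_b)  # [FUNC_1](x); [FUNC_2](x)
```

In file A the function became `[FUNC_1]`; in file B it became `[FUNC_2]`. Any cross-file relationship between the two occurrences is now invisible to the model.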
**3. Lossy and irreversible**
Redaction is a one-way operation. You cannot reconstruct the original identifier names from `[FUNC_1]` without maintaining a separate mapping. If you do maintain a mapping, you have rebuilt half of mutation without its key properties.
---
## Why LLMs Need Semantic Context
To understand why redaction is damaging, it helps to understand (at a simplified level) how transformer attention works.
During inference, each token attends to other tokens in the context window. The attention weights determine how much each token's representation is influenced by each other token. For code, identifiers are high-weight tokens: the model has learned that `getUserPaymentTier` is strongly associated with `userId`, `stripe`, `subscriptions`, and return types like `PricingTier`.
When you replace `getUserPaymentTier` with `[FUNC_1]`:
- The learned associations are severed. `[FUNC_1]` has no strong associations with anything in the training data.
- The model falls back on positional and structural cues alone.
- Output quality degrades, often significantly, on tasks that require understanding the purpose of code.
This is not theoretical. Teams that have deployed NER-based DLP tools on AI coding proxies consistently report that their developers find the AI output less useful after redaction is applied. The practical outcome is that developers bypass the DLP layer, defeating its purpose.
---
## How Pretense Mutation Works
Mutation takes a different approach: instead of removing identifiers, it replaces them with deterministic synthetic identifiers that preserve structure while hiding meaning.
**The algorithm:**
```typescript
import { createHash } from 'crypto';

type IdentifierKind = 'fn' | 'v' | 'cls' | 'method' | 'prop';

function mutate(identifier: string, kind: IdentifierKind): string {
  // djb2 hash for speed, then SHA-256 for uniqueness guarantee
  const djb2 = djb2Hash(identifier);
  const sha = createHash('sha256').update(identifier).digest('hex');
  // Combine the djb2 checksum with the first 8 hex chars of the SHA,
  // then keep the first 4 hex chars of the unsigned result
  const hex = ((djb2 ^ parseInt(sha.slice(0, 8), 16)) >>> 0).toString(16).slice(0, 4);
  return `_${kind}${hex}`;
}

function djb2Hash(str: string): number {
  let hash = 5381;
  for (let i = 0; i < str.length; i++) {
    hash = ((hash << 5) + hash) ^ str.charCodeAt(i);
    hash = hash >>> 0; // unsigned 32-bit
  }
  return hash;
}

// Examples (deterministic — same input always produces same output):
// mutate('getUserPaymentTier', 'fn') -> '_fn4a2b'
// mutate('PaymentProcessor', 'cls') -> '_cls9e3f'
// mutate('userId', 'v') -> '_v7c1d'
```
The key property is determinism. The same identifier always maps to the same synthetic. This matters for three reasons:
1. **Cross-file consistency**: `getUserPaymentTier` is always `_fn4a2b` regardless of which file it appears in. The LLM sees a consistent identifier across the entire context window.
2. **LLM coherence**: The model can still learn that `_fn4a2b` is called with `_v7c1d` and returns `_cls9e3f`. The structural relationships are preserved. Quality degrades by only 7.5% on our benchmark tasks.
3. **Exact reversal**: Because the mapping is deterministic and stored locally, Pretense can reverse every synthetic in the LLM's response with 100% fidelity. You get back working code with real identifier names.
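As a rough illustration of the determinism property, the scheme can be mirrored in Python. This is a hedged sketch, not Pretense's actual implementation, and the concrete hex suffixes it produces are not guaranteed to match the examples above:

```python
import hashlib

def djb2(s: str) -> int:
    # djb2 (XOR variant), kept in the unsigned 32-bit range like the TypeScript version
    h = 5381
    for ch in s:
        h = (((h << 5) + h) ^ ord(ch)) & 0xFFFFFFFF
    return h

def mutate(identifier: str, kind: str) -> str:
    sha = hashlib.sha256(identifier.encode()).hexdigest()
    mixed = (djb2(identifier) ^ int(sha[:8], 16)) & 0xFFFFFFFF
    return f"_{kind}{format(mixed, 'x')[:4]}"

# Deterministic: the same identifier always maps to the same synthetic
assert mutate("getUserPaymentTier", "fn") == mutate("getUserPaymentTier", "fn")
```

Because the output depends only on the identifier and its kind, two different files (or two separate requests) that mention the same identifier always produce the same synthetic.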
---
## Benchmark: Mutation vs Redaction on Code Completion Quality
We measured output quality on four task categories across 200 code samples drawn from real open-source projects. Each sample was processed three ways: unmodified (baseline), redacted (NER-based), and mutated (Pretense).
Quality was scored by three criteria: functional correctness (does the output compile and pass existing tests), semantic accuracy (does the output match the developer's intent), and completeness (did the model complete the full requested change without hallucinating missing context).
| Task | Baseline | Redaction | Mutation |
|---|---|---|---|
| Function completion | 100% | 61% | 94% |
| Refactoring | 100% | 48% | 91% |
| Test generation | 100% | 72% | 96% |
| Bug diagnosis | 100% | 53% | 89% |
| Average | 100% | 58.5% | 92.5% |
Redaction reduces average quality to 58.5% of baseline. Mutation reduces it to 92.5%.
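The averages follow directly from the per-task scores:

```python
# Per-task scores: function completion, refactoring, test generation, bug diagnosis
redaction = [61, 48, 72, 53]
mutation = [94, 91, 96, 89]

print(sum(redaction) / len(redaction))  # 58.5
print(sum(mutation) / len(mutation))    # 92.5
```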
The intuition for why mutation outperforms: the LLM can still reason about `_fn4a2b` calling `_v7c1d` and returning `_cls9e3f` because it can observe the structural patterns in context. It cannot reason about `[FUNC_1]` calling `[PARAM_1]` because there are no structural patterns to observe.
---
## The Round-Trip Problem
Redaction is fundamentally lossy, and this creates a practical problem that is often overlooked.
**Redaction round-trip:**
```
Input:      getUserPaymentTier(userId)
Redacted:   [FUNC_1]([PARAM_1])
LLM output: [FUNC_1]([PARAM_1]) { return [FUNC_2]([PARAM_1]) }
Reversed:   getUserPaymentTier(userId) { return ???(userId) }
```

The reverse pass fails because the DLP layer does not know what `[FUNC_2]` maps to. If the LLM introduced a new identifier (a refactored helper function), that identifier was never in the original mapping and cannot be reversed.
**Mutation round-trip:**
```
Input:      getUserPaymentTier(userId)
Mutated:    _fn4a2b(_v7c1d)
LLM output: _fn4a2b(_v7c1d) { return _fn9b2e(_v7c1d) }
Reversed:   getUserPaymentTier(userId) { return classifyTierForUser(userId) }
```

Wait. Where did `classifyTierForUser` come from? The LLM introduced a new identifier (`_fn9b2e`) that was not in the original mutation map.
Pretense handles this by reverse-mapping only identifiers that are in the stored mutation map. New identifiers introduced by the LLM (`_fn9b2e`) are treated as suggestions: Pretense writes them to the response with a note in the audit log that they represent LLM-suggested identifiers, not original code. The developer then names the new function themselves.
This is actually the correct behavior. You want the LLM to suggest structure and logic; you want the developer to name the new code.
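A minimal sketch of that reverse pass (illustrative function names and regex, not Pretense's actual API):

```python
import re

def reverse_response(llm_output: str, mutation_map: dict[str, str],
                     audit_log: list[str]) -> str:
    """Reverse known synthetics; leave LLM-introduced ones for the developer to name."""
    def repl(m: re.Match) -> str:
        synthetic = m.group(0)
        if synthetic in mutation_map:
            return mutation_map[synthetic]  # stored locally, restored exactly
        audit_log.append(f"LLM-suggested identifier kept as-is: {synthetic}")
        return synthetic  # new identifier: surfaced as a suggestion, not reversed

    return re.sub(r"_(?:fn|v|cls|method|prop)[0-9a-f]{1,4}", repl, llm_output)

log: list[str] = []
out = reverse_response(
    "_fn4a2b(_v7c1d) { return _fn9b2e(_v7c1d) }",
    {"_fn4a2b": "getUserPaymentTier", "_v7c1d": "userId"},
    log,
)
print(out)  # getUserPaymentTier(userId) { return _fn9b2e(userId) }
```

Known synthetics come back as the original names; the unmapped `_fn9b2e` survives untouched, with an audit-log entry flagging it as an LLM suggestion.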
---
## The Practical Conclusion
Redaction is the wrong primitive for AI privacy. It was designed for static text scanning (credit card numbers, Social Security Numbers, email addresses) and adapted to code with poor results.
Mutation was designed for code from the ground up. It treats identifiers as structural tokens that must be preserved for LLM coherence, replaces their meaning without removing their structure, and provides exact reversal to maintain round-trip fidelity.
For teams choosing an AI privacy layer, the benchmark difference is not marginal. Cutting output quality nearly in half (redaction, 58.5% of baseline) versus a 7.5% reduction (mutation, 92.5% of baseline) is the difference between a tool developers actually use and a compliance checkbox that gets bypassed.
[Read the technical docs at pretense.ai/docs](/docs)