PII Redaction for AI: How It Works and Why It Matters

The Basic Concept

When an employee uses AI with data containing personally identifiable information (PII), that information flows to external servers. Once there, it may be:

Retained in logs
Used for model training
Stored in ways you don't control
Subject to the AI provider's security posture

PII redaction intercepts this flow. Before the prompt reaches external AI, sensitive elements are identified and removed or replaced. The AI processes the sanitized prompt. When the response returns, the sensitive elements are restored so the output makes sense to the user.

The employee experiences AI as normal. The PII never leaves the controlled environment.
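
The round trip described above can be sketched in a few lines. This is a minimal illustration, not a production design: it detects only email addresses, and `protected_call` and `call_model` are hypothetical names standing in for whatever client actually talks to the AI provider.

```python
import re
from typing import Callable

# Deliberately simple email pattern; real detectors cover many more PII types.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def redact(prompt: str) -> tuple[str, dict[str, str]]:
    """Swap each email address for a token; return sanitized text and the token map."""
    token_map: dict[str, str] = {}

    def substitute(match: re.Match) -> str:
        token = f"[EMAIL_{len(token_map) + 1}]"
        token_map[token] = match.group(0)
        return token

    return EMAIL.sub(substitute, prompt), token_map

def restore(response: str, token_map: dict[str, str]) -> str:
    """Put the original values back into the model's response."""
    for token, value in token_map.items():
        response = response.replace(token, value)
    return response

def protected_call(prompt: str, call_model: Callable[[str], str]) -> str:
    """Redact, call the model, restore. call_model is an assumed stand-in."""
    sanitized, token_map = redact(prompt)
    response = call_model(sanitized)  # only sanitized text crosses the boundary
    return restore(response, token_map)
```

The token map never leaves the local process, which is what keeps the original values inside the controlled environment.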

What Counts as PII?

PII encompasses any information that can identify an individual, directly or indirectly. Common categories:

Direct identifiers:

Full names
Social Security numbers
Driver's license numbers
Passport numbers
Email addresses
Phone numbers
Physical addresses

Indirect identifiers:

Date of birth
Gender
Race/ethnicity
Job title + employer
Geographic indicators
Unique characteristics

Contextual identifiers:

Account numbers
Customer IDs
Transaction references
Case numbers

Special categories (higher protection):

Health information
Financial data
Biometric data
Political opinions
Religious beliefs
Sexual orientation

Effective PII redaction must handle all these categories, recognizing them in unstructured text, varied formats, and contextual usage.

Detection Methods

Pattern Matching

The simplest approach: look for patterns that match known PII formats.

Works well for:

Social Security numbers (###-##-####)
Credit card numbers (typically 13 to 19 digits with issuer-specific prefixes and a Luhn check digit)
Phone numbers (various formats)
Email addresses (text@domain.tld)

Limitations:

False positives: "My order number is 123-45-6789" matches SSN format
False negatives: Non-standard formats missed
No contextual understanding: Can't distinguish "John Smith" the customer from "John Smith" the historical figure

Pattern matching is necessary but not sufficient.
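
A small sketch of pattern-based detection, assuming US-style SSN formatting and card numbers validated with the Luhn checksum. The pattern set and the `find_pii` helper are illustrative choices, not a complete rule set.

```python
import re

PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    # 13 to 19 digits, optionally separated by spaces or hyphens
    "CARD": re.compile(r"\b(?:\d[ -]?){12,18}\d\b"),
}

def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_pii(text: str) -> list[tuple[str, str]]:
    """Return (label, matched text) pairs for every pattern hit."""
    hits = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            if label == "CARD" and not luhn_valid(m.group(0)):
                continue  # the checksum filters out random digit runs
            hits.append((label, m.group(0)))
    return hits
```

Running `find_pii("My order number is 123-45-6789")` still reports an SSN, which is exactly the false positive described above: the pattern has no way to know the number is an order reference.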

Named Entity Recognition (NER)

Machine learning models trained to identify entities in text: people, organizations, locations, dates, etc.

Works well for:

Names in various formats
Addresses as continuous text
Organizations mentioned in context
Dates and times

Limitations:

Requires good models with broad training
May struggle with unusual names or formats
Contextual ambiguity remains challenging

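Modern NER, especially transformer-based models, generally outperforms pattern matching on free-form entities such as names and addresses in unstructured text. Patterns remain the better fit for rigidly formatted identifiers, so the two are usually combined.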

Contextual Analysis

Understanding meaning, not just pattern or entity type.

Examples:

"Call me at 555-1234" → phone number (context: communication)
"Patient presented with 555-1234 mg dosage" → not a phone number (context: medical)
"John's account balance is $5,432" → PII (John is identifiable, balance is his)
"The average account balance is $5,432" → not PII (aggregate, no individual)

Context-aware detection reduces false positives and catches PII that patterns alone miss.
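
Production systems typically use a trained model for this step. As a rough stand-in, the sketch below uses a keyword window around each phone-like match: a dosage unit right after the number vetoes it, and a contact verb shortly before it confirms it. The cue lists and the 30-character window are assumptions chosen for the examples above.

```python
import re

PHONE_LIKE = re.compile(r"\b\d{3}-\d{4}\b")
# Units that, when they directly follow the number, suggest a measurement
UNIT_AFTER = re.compile(r"\s*(?:mg|ml|mcg|kg|units?)\b", re.IGNORECASE)
# Words that, shortly before the number, suggest a contact detail
CONTACT_CUE = re.compile(r"\b(?:call|phone|reach|text|dial)\b", re.IGNORECASE)

def classify_phone_candidates(text: str) -> list[tuple[str, str]]:
    """Label each phone-like match as 'phone', 'not_phone', or 'uncertain'."""
    results = []
    for m in PHONE_LIKE.finditer(text):
        before = text[max(0, m.start() - 30):m.start()]
        if UNIT_AFTER.match(text, m.end()):
            verdict = "not_phone"
        elif CONTACT_CUE.search(before):
            verdict = "phone"
        else:
            verdict = "uncertain"
        results.append((m.group(0), verdict))
    return results
```

An "uncertain" verdict is where a policy decision comes in: a cautious deployment might redact it anyway, a permissive one might let it through.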

Domain-Specific Recognition

Some PII types are domain-specific:

Medical record numbers (healthcare)
Account numbers (financial services)
Case numbers (legal)
Employee IDs (HR systems)

Effective PII detection often requires domain customization to recognize industry-specific identifiers.
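
One way to support domain customization is a registry of recognizers that can be combined per deployment. The identifier formats below (`MRN-` plus seven digits, `EMP` plus five digits) are invented for illustration; real formats vary by organization.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Recognizer:
    label: str
    pattern: re.Pattern

# Hypothetical formats; replace with the organization's actual ID schemes.
HEALTHCARE = [Recognizer("MRN", re.compile(r"\bMRN-\d{7}\b"))]
HR = [Recognizer("EMPLOYEE_ID", re.compile(r"\bEMP\d{5}\b"))]

def detect(text: str, recognizers: list[Recognizer]) -> list[tuple[str, str]]:
    """Run every configured recognizer and collect (label, match) pairs."""
    return [
        (r.label, m.group(0))
        for r in recognizers
        for m in r.pattern.finditer(text)
    ]
```

A hospital HR department might run `detect(text, HEALTHCARE + HR)`, while a law firm would load a different list entirely.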

Redaction Strategies

Once PII is detected, several strategies can apply:

Removal

Simply delete the PII from the prompt.

Example:

Original: "Help me write an email to John Smith about his account balance of $5,432"
Redacted: "Help me write an email to about his account balance of"

Problem: The AI can't produce useful output without context. Removed information creates gaps that break the request.

Masking

Replace each PII value with the same fixed placeholder.

Example:

Original: "John Smith's SSN is 123-45-6789"
Masked: "[REDACTED]'s SSN is [REDACTED]"

Better: The AI understands something was there. But restoration is impossible because every value collapses to the same placeholder, so you can no longer tell which redaction was which.

Tokenization

Replace PII with consistent, reversible tokens.

Example:

Original: "Send email to John Smith (john.smith@example.com) about his $5,432 balance"
Tokenized: "Send email to [PERSON_1] ([EMAIL_1]) about his [AMOUNT_1] balance"
Token map stored locally: PERSON_1 = "John Smith", EMAIL_1 = "john.smith@example.com", AMOUNT_1 = "$5,432"

Best: The AI can work with the structure. Tokens are restored in the response. The user experience is seamless.
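
A minimal tokenizer that reproduces the example above might look like this. Name detection normally comes from NER; here a caller-supplied `known_people` list stands in for it, and the email and amount patterns are simplified assumptions.

```python
import re

DETECTORS = [
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("AMOUNT", re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?")),
]
TOKEN = re.compile(r"\[[A-Z]+_\d+\]")

def tokenize(text: str, known_people: list[str]) -> tuple[str, dict[str, str]]:
    """Replace detected values with numbered tokens; same value, same token."""
    token_map: dict[str, str] = {}   # token -> original value
    reverse: dict[str, str] = {}     # original value -> token
    counters: dict[str, int] = {}

    def token_for(label: str, value: str) -> str:
        if value not in reverse:
            counters[label] = counters.get(label, 0) + 1
            token = f"[{label}_{counters[label]}]"
            reverse[value] = token
            token_map[token] = value
        return reverse[value]

    for name in known_people:  # stand-in for an NER step
        if name in text:
            text = text.replace(name, token_for("PERSON", name))
    for label, pattern in DETECTORS:
        text = pattern.sub(lambda m, label=label: token_for(label, m.group(0)), text)
    return text, token_map

def restore(text: str, token_map: dict[str, str]) -> str:
    """Single pass over the response; unknown tokens are left untouched."""
    return TOKEN.sub(lambda m: token_map.get(m.group(0), m.group(0)), text)
```

Keeping a reverse map is what makes tokens consistent: a value that appears twice gets the same token both times, so the AI sees one entity rather than two.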

Synthetic Substitution

Replace real PII with synthetic equivalents.

Example:

Original: "John Smith, born 03/15/1985, lives at 123 Main St"
Synthetic: "Michael Johnson, born 07/22/1987, lives at 456 Oak Ave"

Useful for: Training data, testing, analytics where structure matters but real values don't.

Limitation: More complex to reverse accurately; typically used for batch processing rather than interactive AI.
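
One possible approach to synthetic substitution is a keyed, deterministic mapping, so the same real name always becomes the same fake name within a dataset. The sketch below handles names only (dates and addresses would need their own generators), and the `FAKE_NAMES` pool and key handling are assumptions for illustration.

```python
import hashlib
import hmac

# Tiny illustrative pool; a real pool would be far larger to limit collisions.
FAKE_NAMES = ["Michael Johnson", "Priya Patel", "Maria Garcia", "Wei Chen"]

def synthetic_name(real_name: str, key: bytes) -> str:
    """Pick a fake name deterministically from a keyed hash of the real one."""
    digest = hmac.new(key, real_name.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:4], "big") % len(FAKE_NAMES)
    return FAKE_NAMES[index]

def substitute_names(text: str, real_names: list[str], key: bytes) -> str:
    """Replace each known real name with its synthetic counterpart."""
    for name in real_names:
        text = text.replace(name, synthetic_name(name, key))
    return text
```

With a pool this small, two real people can map to the same fake name, which illustrates why reversing synthetic data accurately is harder than reversing tokens.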

The Restoration Challenge

Tokenization only works if restoration works. This is harder than it sounds.

Multi-Turn Conversations

AI interactions often span multiple turns. Tokens must be consistent across turns:

Turn 1:

User: "Help me with John Smith's account" → Sent as "[PERSON_1]'s account"
AI: "What would you like to know about [PERSON_1]'s account?"
Restored to user: "What would you like to know about John Smith's account?"

Turn 2:

User: "John Smith's balance is $5,432" → Sent as "[PERSON_1]'s balance is [AMOUNT_1]"
Token map grows: PERSON_1 = "John Smith", AMOUNT_1 = "$5,432"

If Turn 2 used a different token for John Smith, the AI would think two different people are involved. Consistency matters.
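
Consistency across turns usually comes down to keeping one token map per conversation. A sketch, assuming some upstream detector supplies `(label, value)` pairs for each message; `RedactionSession` is a hypothetical name.

```python
class RedactionSession:
    """Holds one token map for the lifetime of a conversation."""

    def __init__(self) -> None:
        self.value_to_token: dict[str, str] = {}
        self.token_to_value: dict[str, str] = {}
        self._counters: dict[str, int] = {}

    def _token(self, label: str, value: str) -> str:
        # Reuse the existing token if this value was seen in an earlier turn.
        if value not in self.value_to_token:
            self._counters[label] = self._counters.get(label, 0) + 1
            token = f"[{label}_{self._counters[label]}]"
            self.value_to_token[value] = token
            self.token_to_value[token] = value
        return self.value_to_token[value]

    def redact(self, text: str, entities: list[tuple[str, str]]) -> str:
        """entities: (label, value) pairs from an assumed detection step."""
        for label, value in entities:
            text = text.replace(value, self._token(label, value))
        return text

    def restore(self, text: str) -> str:
        for token, value in self.token_to_value.items():
            text = text.replace(token, value)
        return text
```

Because the session object outlives any single message, "John Smith" maps to `[PERSON_1]` in every turn, and new values simply extend the map.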

AI Rephrasing

The AI might rephrase or restructure token references:

Sent: "Send [PERSON_1] information about [AMOUNT_1]"
AI returns: "[AMOUNT_1] has been communicated to [PERSON_1] via the customer portal"

Restoration must handle tokens appearing in different positions and grammatical contexts.

Partial Tokens

Sometimes the AI generates partial references:

Sent: "[PERSON_1]'s email is [EMAIL_1]"
AI returns: "I'll draft an email to Person 1 at their address"

The restoration layer must recognize "Person 1" as a reference to [PERSON_1] even though formatting changed.
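
Handling reordered and reformatted tokens can be approached with a tolerant pattern that normalizes each variant back to its canonical token before lookup. This is a sketch under the assumption that only three token labels exist; note that loose matching carries its own risk, since ordinary text like "person 1 of 3" would also be restored.

```python
import re

# Accepts [PERSON_1], PERSON_1, Person 1, person-1, and similar variants.
TOKEN_VARIANT = re.compile(
    r"\[?\b(PERSON|EMAIL|AMOUNT)[ _-]?(\d+)\b\]?",
    re.IGNORECASE,
)

def restore(response: str, token_map: dict[str, str]) -> str:
    """Restore tokens wherever they appear, tolerating formatting drift."""
    def replace(m: re.Match) -> str:
        canonical = f"[{m.group(1).upper()}_{m.group(2)}]"
        # Unknown tokens are left exactly as the model wrote them.
        return token_map.get(canonical, m.group(0))
    return TOKEN_VARIANT.sub(replace, response)
```

Because the substitution runs in a single pass over the response, token position and order do not matter, and restored values are never re-scanned for further tokens.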

Multiple Values

Complex prompts may have many tokenized values:

"Prepare a summary for [PERSON_1], [PERSON_2], and [PERSON_3] showing their balances of [AMOUNT_1], [AMOUNT_2], and [AMOUNT_3] respectively, with addresses at [ADDRESS_1], [ADDRESS_2], and [ADDRESS_3]."

Token management at scale requires careful implementation.

Performance Considerations

PII redaction must be fast enough that users don't notice latency.

Processing time budget:

User tolerance: ~500ms-1s additional delay feels acceptable
User frustration: >2s delay feels slow
User workaround: Significant delay encourages using unprotected alternatives

Optimization approaches:

On-device processing (avoids network round-trips)
Efficient ML models (smaller models for common patterns)
Caching (remember decisions for repeated patterns)
Parallel processing (inspect while preparing request)

The goal is protection that's effectively invisible to the user.
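
Two of these ideas, caching and a latency budget, can be sketched together. The 0.5 second budget mirrors the tolerance figure above; the per-line cache granularity and the SSN-only detector are simplifying assumptions. A cache of this kind holds raw text in memory, so it would need to stay on the device.

```python
import re
import time
from functools import lru_cache

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
BUDGET_SECONDS = 0.5  # taken from the ~500ms tolerance discussed above

@lru_cache(maxsize=4096)
def detect_cached(segment: str) -> tuple[str, ...]:
    """Detection result for one line; repeated lines are served from cache."""
    return tuple(m.group(0) for m in SSN.finditer(segment))

def timed_detect(text: str) -> tuple[list[str], bool]:
    """Return all hits plus a flag saying whether detection met the budget."""
    start = time.perf_counter()
    hits = [h for line in text.splitlines() for h in detect_cached(line)]
    elapsed = time.perf_counter() - start
    return hits, elapsed <= BUDGET_SECONDS
```

The boolean lets the caller decide what to do when the budget is blown, for example logging the slow path rather than silently delaying the user.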

Accuracy Considerations

False Positives

Detecting PII where none exists:

"The patient's blood pressure was 120/80" → "120/80" flagged as account number
"Meeting at 555 California St" → "555" flagged as phone number prefix
"Contact the John Smith Foundation" → organization name flagged as person

False positives reduce utility. Aggressive redaction that removes non-sensitive information makes AI less useful and creates user frustration.

False Negatives

Missing actual PII:

Unusual name formats not recognized
Non-standard identifier patterns missed
Contextual PII not detected

False negatives create risk. PII that slips through defeats the protection purpose.

The Tradeoff

No detector achieves perfect accuracy in both directions, and reducing one type of error usually increases the other. Organizations must choose their tolerance:

High security environments may accept more false positives
High productivity environments may accept more false negatives
Most environments seek balance with tunable thresholds

Transparent reporting on what's being redacted helps users understand and trust the system.
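
A tunable threshold can be as simple as filtering detections by confidence score. The profile names, threshold values, and scores below are invented for illustration; a real detector would supply its own scores.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    text: str
    label: str
    score: float  # detector confidence between 0 and 1 (assumed)

# Lower threshold: redact more, accepting more false positives.
# Higher threshold: redact less, accepting more false negatives.
THRESHOLDS = {"high_security": 0.3, "balanced": 0.6, "high_productivity": 0.9}

def select(detections: list[Detection], threshold: float) -> list[Detection]:
    """Keep only detections the policy considers confident enough to redact."""
    return [d for d in detections if d.score >= threshold]
```

Returning the selected detections, rather than only the redacted text, also gives the reporting layer something concrete to show users.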

Implementation Architecture

Where Redaction Happens

On-device (recommended for interactive use):

Lowest latency
Works offline and on any network
User's original data never transmitted
Requires endpoint software deployment

Proxy-based:

Centralized management
Works for traffic through managed network
Adds network latency
Misses off-network usage

API integration:

For AI integrated into applications
Redaction in application layer
Developer implementation required

Integration Points

Effective PII redaction integrates at:

Browser extension (web-based AI tools)
Desktop application hooks (native AI apps)
API middleware (programmatic AI access)
Email/messaging integration (AI assistants in communication tools)

Protection is only as complete as this coverage: any AI interaction point left unintegrated remains a path for unredacted data.

Beyond PII

While PII is the common focus, similar techniques apply to other sensitive data:

Source code: Identify and protect proprietary algorithms, credentials, API keys
Financial data: Protect account numbers, transaction amounts, business metrics
Legal content: Protect case names, privileged communications, protected information
Trade secrets: Identify and protect proprietary processes, formulas, methods

The detection methods differ (code patterns vs. entity recognition), but the redaction-restoration approach applies broadly.
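
For source code, the same tokenize-and-restore idea can be applied to credentials. The sketch below assumes two patterns: the `AKIA` prefix format used by AWS access key IDs, and a generic quoted assignment to a variable named like a secret. Real secret scanners use many more rules.

```python
import re

SECRET_PATTERNS = {
    "AWS_ACCESS_KEY_ID": re.compile(r"(?P<value>\bAKIA[0-9A-Z]{16}\b)"),
    "ASSIGNED_SECRET": re.compile(
        r"""\b(?:api_key|password|secret)\s*=\s*["'](?P<value>[^"'\s]{8,})["']""",
        re.IGNORECASE,
    ),
}

def redact_secrets(code: str) -> tuple[str, dict[str, str]]:
    """Replace secret values with tokens, keeping the surrounding code intact."""
    token_map: dict[str, str] = {}

    def swap(label: str):
        def inner(m: re.Match) -> str:
            token = f"[{label}_{len(token_map) + 1}]"
            token_map[token] = m.group("value")
            # Replace only the secret itself, not the variable name or quotes.
            return m.group(0).replace(m.group("value"), token)
        return inner

    for label, pattern in SECRET_PATTERNS.items():
        code = pattern.sub(swap(label), code)
    return code, token_map
```

The AI still sees that a key is assigned and where, which is usually enough to help with the code, while the key itself stays local.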

The Bottom Line

PII redaction enables AI usage that would otherwise be prohibited. When done well:

Users get full AI functionality
Sensitive data stays protected
Compliance requirements are met
Audit trails demonstrate governance

When done poorly:

Gaps in protection create risk
Excessive false positives frustrate users
Restoration failures break user experience
Performance issues drive workarounds

The difference between "done well" and "done poorly" is in the technical details: detection accuracy, token consistency, restoration reliability, and performance optimization.
