PII masking for LLMs
Automatically detect and mask personally identifiable information before it reaches your LLM provider. Replace sensitive entities with reversible tokens while preserving prompt utility — a core capability of AI DLP.
Why PII masking matters
Research shows that 11% of enterprise AI prompts contain sensitive data. Employees routinely paste customer records, medical notes, financial statements, and HR data into ChatGPT and Copilot — often without realizing the security implications. Once this data reaches an external AI provider, it is outside your organization's control.
PII masking solves this by detecting sensitive entities in real time and replacing them with structured tokens before the prompt leaves your organization. The LLM receives a clean prompt, generates a useful response, and your data never reaches the provider. Learn about the broader data protection strategy in our AI data leak prevention guide.
How PII masking works
When a user submits a prompt through any PromptWall-protected surface — browser, editor, or CLI — the detection pipeline runs entity recognition across the prompt text:
- Entity detection — Named entity recognition identifies PII patterns in the prompt text using NLP models and regex patterns.
- Confidence scoring — Each detected entity receives a confidence score (0–100%). Only entities above the configured threshold trigger masking.
- Token replacement — Detected entities are replaced with structured tokens: {{PERSON_1}}, {{EMAIL_1}}, {{PHONE_1}}.
- Context preservation — Multiple references to the same entity use the same token, maintaining semantic relationships in the prompt.
- Audit logging — The original prompt, masked version, and all detected entities are recorded in the audit trail.
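The pipeline above can be sketched in a few lines. This is a simplified, hypothetical illustration, not PromptWall's implementation: real detection combines NLP models with regex, while this sketch uses regex only, and the pattern set and function names are invented for the example:

```python
import re

# Illustrative entity patterns (a real system would use many more,
# plus NLP-based recognizers for names, addresses, etc.).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\- ]{7,}\d"),
}

def mask(prompt: str):
    """Replace detected entities with {{TYPE_n}} tokens.

    Repeated occurrences of the same value reuse the same token
    (context preservation), and the token map is returned so masking
    stays reversible for audit logging and response rehydration.
    """
    token_map = {}   # token -> original value
    seen = {}        # (type, value) -> token already assigned
    counters = {}    # per-type counter for _1, _2, ...

    def replace(entity_type):
        def _sub(match):
            value = match.group(0)
            key = (entity_type, value)
            if key not in seen:
                counters[entity_type] = counters.get(entity_type, 0) + 1
                seen[key] = f"{{{{{entity_type}_{counters[entity_type]}}}}}"
                token_map[seen[key]] = value
            return seen[key]
        return _sub

    masked = prompt
    for entity_type, pattern in PATTERNS.items():
        masked = pattern.sub(replace(entity_type), masked)
    return masked, token_map
```

Because the same value always maps to the same token, `mask("Email john@acme.com or john@acme.com")` produces a single `{{EMAIL_1}}` token used twice, preserving the relationship between both mentions.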
Before and after masking
Here is an example of PII masking in action. The original prompt contains customer data; the masked version preserves the prompt structure while protecting all sensitive entities:
Original Prompt
Please draft an email to John Smith (john.smith@acme.com, +1-555-0123) about his account #4532-1234-5678-9012 payment of $15,420 due on March 15.
Masked Prompt
Please draft an email to {{PERSON_1}} ({{EMAIL_1}}, {{PHONE_1}}) about his account #{{CREDIT_CARD_1}} payment of {{AMOUNT_1}} due on {{DATE_1}}.
Supported entity types
PromptWall detects 30+ entity types out of the box, plus custom patterns defined by tenant administrators.
👤 Personal
- Person names
- Email addresses
- Phone numbers
- Physical addresses
- Dates of birth

💳 Financial
- Credit card numbers
- IBAN / bank accounts
- SSN / Tax IDs
- Financial amounts

🪪 National IDs
- TC Kimlik (Turkey)
- Passport numbers
- Driver license numbers
- National Insurance numbers

🔑 Technical
- IP addresses
- API keys
- Connection strings
- AWS credentials

⚙️ Custom
- Tenant-defined patterns
- Internal project codes
- Classification labels
- Proprietary formats
Configurable detection thresholds
Each entity type has an independent confidence threshold (default 80%). Security teams can adjust thresholds per entity type to balance protection with usability. For example:
- Credit cards — set to 90% to reduce false positives on random number sequences.
- Person names — set to 70% for broader coverage in multilingual environments.
- Custom patterns — define exact match patterns with 100% threshold for zero false positives.
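Per-entity thresholds like those above reduce to a small lookup. This sketch mirrors the example values from the list; the data structures and function name are hypothetical, not PromptWall's actual configuration API:

```python
# Default threshold applies to any entity type without an override.
DEFAULT_THRESHOLD = 0.80

THRESHOLDS = {
    "CREDIT_CARD": 0.90,          # stricter: fewer false positives on random digits
    "PERSON": 0.70,               # looser: broader coverage in multilingual text
    "CUSTOM_PROJECT_CODE": 1.00,  # exact-match pattern: zero false positives
}

def should_mask(entity_type: str, confidence: float) -> bool:
    """Mask only when detector confidence meets the entity's threshold."""
    return confidence >= THRESHOLDS.get(entity_type, DEFAULT_THRESHOLD)
```

With this configuration, a credit-card candidate at 85% confidence is left unmasked, while a person name at 72% is masked.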
Part of a complete AI DLP strategy
PII masking is one component of AI DLP. A complete data protection strategy also includes document leak detection (semantic similarity against protected corpora), sensitive data classification, and policy enforcement to govern how different data types are handled.
Unlike traditional DLP, which operates at the network layer, PromptWall inspects prompt content at the application layer — inside the browser, editor, and CLI where AI interactions actually happen.
Deploy PII masking today
See PromptWall detect and mask PII in real time across ChatGPT, Copilot, and API workflows.
Frequently asked questions
What types of PII can PromptWall detect?
PromptWall detects 30+ entity types including person names, email addresses, phone numbers, credit card numbers, SSNs, passport numbers, national IDs (TC Kimlik, IBAN, etc.), IP addresses, physical addresses, dates of birth, and custom tenant-defined patterns.
Does PII masking break the AI response?
No. Masked prompts preserve semantic structure using reversible tokens like {{PERSON_1}}, {{EMAIL_1}}. The LLM receives a structurally valid prompt and generates a useful response. The tokens maintain context so the model understands relationships between entities.
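The reversibility mentioned above means original values can be restored in the model's output. A minimal sketch, assuming a token map like the one the masking step produces (the function name is illustrative):

```python
def unmask(response: str, token_map: dict) -> str:
    """Replace {{TYPE_n}} tokens in the model output with the originals.

    token_map maps each token (e.g. "{{PERSON_1}}") back to the value
    it replaced, so the provider only ever sees tokens while the user
    sees a fully rehydrated response.
    """
    for token, original in token_map.items():
        response = response.replace(token, original)
    return response
```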
Can I define custom entity types for my organization?
Yes. Tenant administrators can define custom regex patterns and matching rules for proprietary data formats — internal IDs, project codes, classification labels, and any organization-specific sensitive patterns.
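A custom pattern is just a named regex. The project-code format below ("PRJ-2024-0042") and the pattern name are invented for illustration; they are not a built-in PromptWall schema:

```python
import re

# Hypothetical tenant-defined pattern for an internal project code.
CUSTOM_PATTERNS = {
    "PROJECT_CODE": re.compile(r"\bPRJ-\d{4}-\d{4}\b"),
}

text = "Status update for PRJ-2024-0042 is attached."
masked = CUSTOM_PATTERNS["PROJECT_CODE"].sub("{{PROJECT_CODE_1}}", text)
```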
What is the difference between masking and blocking?
Masking replaces detected entities with tokens and allows the prompt to proceed to the LLM. Blocking prevents the entire prompt from being sent. Masking preserves productivity while protecting data; blocking provides maximum security for high-risk scenarios. Both options are configurable per policy rule.
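The masking-versus-blocking choice reduces to a per-rule policy decision. This sketch is hypothetical: the high-risk entity set and function name are examples, not PromptWall's policy schema:

```python
# Example policy: block prompts containing high-risk entity types,
# mask anything else that was detected, allow clean prompts through.
BLOCK_TYPES = {"CREDIT_CARD", "AWS_CREDENTIALS"}

def enforce(detected_types: set) -> str:
    """Return the action for a prompt given its detected entity types."""
    if detected_types & BLOCK_TYPES:
        return "block"   # prompt never leaves the organization
    if detected_types:
        return "mask"    # entities tokenized, prompt proceeds
    return "allow"
```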
Continue reading
AI Data Leak Prevention
Stop sensitive data from reaching LLM providers.
Document Leak Detection
Semantic similarity against protected corpora.
Sensitive Data in AI Prompts
The hidden enterprise risk of AI data exposure.
DLP for Copilot & ChatGPT
Enterprise data protection for popular AI tools.
Real-Time Prompt Inspection
See what AI sends before it sends.
