By Prashant Sharma, CTO, Secuvy
When OpenAI open-sourced its Privacy Filter (OpenAI Privacy Filter GitHub), the enterprise AI community took notice — and rightly so. A frontier lab releasing a production-grade model for personally identifiable information (PII) detection and masking validates what many data and AI teams have been arguing for years: sensitive data filtering isn’t optional infrastructure. It’s the foundation everything else is built on.
That validation matters. But it also surfaces a harder reality that goes beyond just PII.
OpenAI’s Privacy Filter handles eight categories of personal information, a practical and well-considered taxonomy for many common workflows. Most enterprises, however, are managing a much longer list of sensitive data. Examples include controlled unclassified information (CUI), ITAR-controlled technical data, proprietary source code, contract terms, clinical trial identifiers, internal infrastructure references, and product roadmaps. “Sensitive data” has no universal definition. It is specific to your industry, your regulatory environment, your business context, and the particular AI workflow you’re trying to govern.
OpenAI has validated the category and moved the conversation forward. What I want to explore here is the full scope of the problem — and what it actually takes to solve it at enterprise scale.
Privacy filtering is a non-negotiable norm across the AI pipeline
Most organizations started their AI journey by asking a simple question: which models should we use? Online or on-premises? Proprietary or open-weights? That was and is understandable. Model capability, cost, latency, and deployment strategy are the obvious first-order decisions.
The next phase is about data. Which data is appropriate for a model? Where does it reside today? Which data should never leave a controlled environment? Which data should be masked? Which fields are acceptable for one AI workflow but prohibited in another?
These questions are popping up across the AI lifecycle:
- In pre-training and domain adaptation, companies need to make sure sensitive records, regulated information, and proprietary content do not flow unchecked into training corpora. Enterprises may want to pre-train models on data that has been appropriately masked or substituted, so that model performance is preserved without leaking sensitive data.
- In fine-tuning, enterprises need to ensure customer-specific examples, support tickets, contracts, logs, code, and internal documentation do not reveal sensitive information that should not be learned by the model. Likewise, in ongoing reinforcement learning (via human feedback or otherwise), logs, transcripts, and summaries may need to be appropriately scrubbed or filtered.
- During inference and in retrieval-augmented generation, enterprises need controls at the point where documents, chunks, metadata, prompts, responses, and tool outputs are assembled in real time. Ideally, the documents are already appropriately tagged with strong permissioning, or scrubbed with sensitive data masked.
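To make "masked or substituted" concrete, here is a minimal sketch of typed-placeholder masking. It is not how OpenAI Privacy Filter or Secuvy's engine works; the two regular expressions and the `mask` helper are illustrative assumptions, and a real pipeline would rely on a trained detector rather than patterns like these.

```python
import re

# Illustrative patterns only (assumption): real detectors are model-based
# and cover far more entity types than these two.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def mask(text: str) -> str:
    """Replace each matched span with a typed placeholder such as [EMAIL]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    print(mask("Contact jane.doe@example.com or +1 415-555-0100 today."))
```

Typed placeholders keep the sentence structure intact, which is one reason masking is often preferred over dropping a record entirely when the text is destined for training or retrieval.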
This is why filtering, along with the categorization and tagging that support it, is not only a model safety feature. It is an integral part of AI data preparation, governance, and privacy. The question is not whether a system can redact a phone number in a text file, but whether an enterprise can continuously prepare appropriate data for each AI application, at scale, with evidence and control.
Generic PII is a starting point
OpenAI’s Privacy Filter recognizes eight categories: account numbers, private addresses, private emails, private persons, private phones, private URLs, private dates, and secrets. That is a useful and practical taxonomy for many common workflows.
Enterprises, however, rarely stop at generic PII. A defense contractor may care about CUI, ITAR-controlled technical data, CAD metadata, supplier identifiers, project names, program codes, and engineering drawings. A healthcare company may need to distinguish PHI, clinical notes, research identifiers, trial data, device telemetry, and payer information. A semiconductor or manufacturing company may care about schematics, test results, process recipes, source code, customer-specific configurations, and intellectual property embedded in documents or collaboration platforms.
“Private data” is not one universal list. It is different for every company. It changes by industry, geography, regulation, workflow, and business context. It also changes by AI application. The data that is acceptable for an internal summarization assistant may not be acceptable for external model fine-tuning. The data that can be exposed to a support agent may not be appropriate for an autonomous workflow that calls third-party tools.
This is where enterprises need more than a static filter. They need dynamic controls that reflect their own policies, schemas, sensitive fields, risk thresholds, and governance requirements.
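One way to picture a workflow-specific control is a small policy table that maps each AI workflow to the categories it blocks outright and the categories it accepts only once masked. The sketch below is hypothetical: the `WorkflowPolicy` class, the `decide` function, the workflow names, and the `source_code` and `contract_terms` labels are assumptions made for illustration, not part of any product or of OpenAI's taxonomy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowPolicy:
    """Hypothetical per-workflow rules keyed by detected category label."""
    blocked: frozenset = frozenset()  # content with these labels is excluded
    masked: frozenset = frozenset()   # content with these labels is masked first


# Assumed example policies: the same label can be fine in one workflow
# and prohibited in another.
POLICIES = {
    "internal_summarization": WorkflowPolicy(
        blocked=frozenset({"secret"}),
        masked=frozenset({"private_email"}),
    ),
    "external_fine_tuning": WorkflowPolicy(
        blocked=frozenset({"secret", "source_code", "contract_terms"}),
        masked=frozenset({"private_email", "private_person"}),
    ),
}


def decide(workflow: str, detected_labels: set) -> str:
    """Return 'exclude', 'mask', or 'allow' for a document in a workflow."""
    policy = POLICIES[workflow]
    if detected_labels & policy.blocked:
        return "exclude"
    if detected_labels & policy.masked:
        return "mask"
    return "allow"


if __name__ == "__main__":
    labels = {"source_code"}
    print(decide("internal_summarization", labels))
    print(decide("external_fine_tuning", labels))
```

Because the policy is data rather than model weights, it can change without retraining the detector, which is the kind of flexibility a static filter does not offer.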
Benchmarking OpenAI Privacy Filter
We recently ran internal testing comparing Secuvy’s detection engine with OpenAI Privacy Filter on a held-out English-language PII benchmark (ai4privacy/PII-masking-300k on HuggingFace). Our goal was not to turn the release into a competitive scorecard; that would be the wrong lesson to draw from the data, for reasons I explain below.
On the benchmark we reviewed, Secuvy’s engine showed a higher micro span-F1 score: 0.908 compared with 0.896 for OpenAI Privacy Filter, a difference of about 1.2 percentage points. The difference came from higher precision.
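For readers less familiar with the metric, micro span-level scores pool every predicted and gold span across all documents and labels before computing precision, recall, and F1. The sketch below assumes exact-match scoring, where a prediction counts only if document, start, end, and label all agree; the matching rule used in our benchmark run is not specified here, and the function name and tuple layout are illustrative.

```python
def micro_span_scores(gold: set, pred: set) -> tuple:
    """Micro precision, recall, and F1 over (doc_id, start, end, label) spans.

    Assumes exact-match scoring: a predicted span is a true positive only
    if an identical tuple exists in the gold set.
    """
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    gold = {
        (0, 0, 4, "private_person"),
        (0, 10, 22, "private_email"),
        (1, 5, 15, "private_date"),
    }
    pred = {
        (0, 0, 4, "private_person"),
        (0, 10, 22, "private_email"),
        (1, 5, 12, "private_date"),   # wrong boundary, counts as a miss
        (1, 20, 30, "private_date"),  # spurious span, lowers precision
    }
    print(micro_span_scores(gold, pred))
```

Over-predicting spans hurts precision while leaving recall untouched, which is the pattern behind the precision gap discussed here.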
Looking deeper, Secuvy’s classification performed well on dates and person-like identifiers in this dataset. It was competitive on high-volume classes such as account numbers and addresses, with address performance tied in F1. Those are important categories because they dominate enterprise document and record workflows.
We’re not claiming that Secuvy is superior across all categories of PII. The key takeaway is that Secuvy’s engine is competitive on detection quality in this benchmark, with strengths in several important entity types.
Most importantly, the market is not going to be decided by a single static benchmark. It will be decided by whether the privacy layer can adapt to the data, policies, and workflows of each organization.
Results Summary:
| Metric | OpenAI Privacy Filter (OPF) | Secuvy’s Data Classifier (SDC) | Δ (SDC − OPF) |
|---|---|---|---|
| GPU Usage (FP32) | ~6 GB | ~4 GB | −2 GB |
| GPU Usage (FP16) | N/A | ~2.5 GB | N/A |
| Micro Precision | 0.845 | 0.908 | +6.3 pp |
| Micro Recall | 0.954 | 0.908 | −4.6 pp |
| Micro F1 | 0.896 | 0.908 | +1.2 pp |
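As a quick sanity check, F1 is the harmonic mean of precision and recall, so the F1 row can be recomputed from the precision and recall rows. The snippet below does that arithmetic using only the figures in the table; the `f1` helper is just a name chosen for this illustration.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Figures taken from the results table above.
opf_f1 = f1(0.845, 0.954)
sdc_f1 = f1(0.908, 0.908)

print(round(opf_f1, 3), round(sdc_f1, 3))  # prints: 0.896 0.908
print(round((sdc_f1 - opf_f1) * 100, 1))   # prints: 1.2 (percentage points)
```

The check also shows why the gap is small despite the large precision difference: OPF gives up precision but recovers much of it in F1 through higher recall.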
Detailed Analysis:
- private_date: +25.5 pp. Both models hit ~95% recall on dates. The difference is precision: SDC 0.66 vs OPF 0.35.
- private_person: +4.5 pp. SDC identified noisy synthetic usernames (paaltwvkjuijwbj957, etc.) better than OPF. The dataset’s “private_person” definition includes usernames, which is unusual; OPF’s general person detector is more conservative.
- account_number / private_address: Both models are essentially saturated on these classes.
- private_email, private_phone, secret: OPF leads by about 1 pp. It has stronger inherent calibration on these well-formatted entity types.
- The SDC model is 3x smaller on disk and used about 33% less VRAM than OPF (~4 GB vs ~6 GB at FP32).
- Throughput is comparable to OPF (+13%). Given more system resources, we expect roughly a 3x speedup.
Customization is an enterprise requirement
OpenAI’s release of OPF is valuable because it gives the market a strong baseline. It is open-weights, permissively licensed under Apache 2.0, and usable locally. That is good for the ecosystem.
But even OpenAI’s own model card calls out its shortcomings. OpenAI notes that Privacy Filter identifies personal data spans that match its trained label taxonomy and definitions, and that model defaults may not satisfy organization-specific governance requirements without calibration or fine-tuning (OpenAI Privacy Filter model card). It also notes that changing label policies is not something the model supports dynamically at runtime; policy changes require further fine-tuning.
This is the gap enterprises have to solve. They need to define what sensitive data means in their environment, then enforce that definition consistently across AI pipelines.
Sometimes that means classic PII. Sometimes it means secrets, credentials, and tokens. Sometimes it means private URLs, internal hostnames, and infrastructure references. And it can extend to regulated records, export-controlled content, contract terms, source code, product plans, or intellectual property.
Customization has to happen at more than the model layer. It has to connect to policy. It has to understand where data lives. It has to support evidence for audit and compliance teams. It has to work across cloud, SaaS, data platforms, and on-prem environments. And it has to help teams decide not just whether something is sensitive, but whether it is appropriate for a specific AI workflow.
From privacy filter to AI Data Preparation
At Secuvy, we see this as part of a broader shift toward AI Data Preparation, AI governance, and AI privacy. Enterprises are not only asking how to adopt generative AI faster. They are asking us how to adopt it without exposing sensitive data, violating policy, or creating a governance problem they cannot explain later.
That requires a platform approach. Data has to be continuously discovered and classified. Sensitive content has to be cleansed, masked, redacted, or excluded before it feeds AI workflows. Controls have to be mapped to business context, regulatory context, geography, and intellectual property risk. And the process has to produce evidence that security, privacy, legal, and compliance teams can stand behind.
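What "evidence" might look like in practice: one option is a structured record, written each time a document is allowed, masked, or excluded, that captures the decision without storing the sensitive content itself. The sketch below is an assumption about shape only; the `evidence_record` function and its field names are hypothetical and do not describe a Secuvy or OpenAI schema.

```python
import hashlib
import json
from datetime import datetime, timezone


def evidence_record(doc_id: str, content: str, workflow: str,
                    action: str, labels: set) -> dict:
    """Build a hypothetical audit record for one filtering decision.

    Only a SHA-256 digest of the content is kept, so the record can tie a
    decision to an exact document version without copying its contents.
    """
    return {
        "doc_id": doc_id,
        "content_sha256": hashlib.sha256(content.encode("utf-8")).hexdigest(),
        "workflow": workflow,
        "action": action,  # e.g. "allow", "mask", or "exclude"
        "labels": sorted(labels),
        "decided_at": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    record = evidence_record(
        doc_id="ticket-001",  # illustrative identifier
        content="Customer email: jane.doe@example.com",
        workflow="external_fine_tuning",
        action="mask",
        labels={"private_email"},
    )
    print(json.dumps(record, indent=2))
```

Records like this give security, privacy, legal, and compliance teams something to query later when they need to explain why a given document did or did not reach a model.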
OpenAI’s Privacy Filter validates that direction. But for enterprise adoption, the next major step is filtering with flexibility and adaptability. The future of AI governance will not be one-size-fits-all filtering, and we at Secuvy can help you with that.