Linux command
pii-shield command
Text
After copying, replace file names, directories, or parameters as needed.
Common examples
Detect PII
pii-shield detect "Contact john@example.com for help"
Mask
pii-shield mask "My email is john@example.com" --strategy [replace]
Redact
pii-shield mask "SSN 123-45-6789" --strategy [redact]
Hash
pii-shield mask "Card 4111-1111-1111-1111" --strategy [hash]
Process a file
pii-shield file [input.txt] -o [output.txt] --strategy [redact]
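For pipeline use, the file example above can be driven from a script. The following is a minimal sketch, assuming `pii-shield` is on the PATH and accepts exactly the arguments shown on this page (`file`, `-o`, `--strategy`). The helper names `build_file_command` and `run_file_command`, and the `redact` default, are hypothetical and not part of the tool.

```python
import shlex
import subprocess

# Strategy names as listed in the description on this page.
STRATEGIES = ("replace", "redact", "mask", "hash")


def build_file_command(input_path, output_path=None, strategy="redact"):
    """Build the argv list for `pii-shield file` (hypothetical helper).

    The "redact" default is a choice made for this sketch only.
    """
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")
    cmd = ["pii-shield", "file", input_path]
    if output_path is not None:
        cmd += ["-o", output_path]
    cmd += ["--strategy", strategy]
    return cmd


def run_file_command(input_path, output_path=None, strategy="redact"):
    """Run the command and return its stdout; raises if the tool exits non-zero."""
    result = subprocess.run(
        build_file_command(input_path, output_path, strategy),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Print the command line instead of running it, so the sketch works
    # even where pii-shield is not installed.
    print(shlex.join(build_file_command("input.txt", "output.txt", "redact")))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.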
Description
pii-shield is a command-line front-end to a dual-engine PII detection library. Each input is run through Microsoft Presidio locally (using spaCy NER models plus regex patterns) and, when configured, also through Microsoft Foundry / Azure Language Service in the cloud. Results from both engines are merged so that high-confidence cloud detections complement the local ones; when no Azure endpoint is configured, all traffic stays local.
The library recognises common entity types out of the box, including person names, email addresses, phone numbers, credit-card numbers, social security numbers, IBAN/bank account numbers, IP addresses and URLs. Detected entities are then transformed by one of four masking strategies (replace, redact, mask, hash), which makes the same tool useful both for safely sharing log or data samples and for producing deterministically anonymised datasets.
The CLI is intended for one-off operations and pipeline integration; the same engine is also reachable through a Python API and a REST server for embedding in larger applications.
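The merge step described above can be illustrated with a short sketch. This is not the library's actual algorithm, which this page does not document. It assumes each engine returns spans with a type, character offsets, and a score, that cloud results below a threshold are ignored, and that on overlap the higher-scoring span wins. The `Entity` class, `merge_detections`, and the 0.8 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    type: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    score: float
    source: str  # "local" or "cloud"


def overlaps(a, b):
    """True when the two character spans share at least one position."""
    return a.start < b.end and b.start < a.end


def merge_detections(local, cloud, cloud_threshold=0.8):
    """Keep all local hits; add confident cloud hits that add information."""
    merged = list(local)
    for c in cloud:
        if c.score < cloud_threshold:
            continue  # assumed rule: ignore low-confidence cloud results
        clash = [m for m in merged if overlaps(m, c)]
        if not clash:
            merged.append(c)  # cloud found something the local engine missed
        elif all(c.score > m.score for m in clash):
            # cloud span is strictly more confident than every overlapping span
            merged = [m for m in merged if m not in clash]
            merged.append(c)
    return sorted(merged, key=lambda e: e.start)


# Offsets refer to: "Contact john@example.com or call 555-0100"
local_hits = [Entity("EMAIL", 8, 24, 1.0, "local")]
cloud_hits = [
    Entity("EMAIL", 8, 24, 0.95, "cloud"),  # overlaps, lower score: dropped
    Entity("PHONE", 33, 41, 0.85, "cloud"),  # new span: added
    Entity("PERSON", 0, 7, 0.3, "cloud"),    # below threshold: ignored
]
result = merge_detections(local_hits, cloud_hits)
print([(e.type, e.source) for e in result])
```

With an empty `cloud` list the function returns the local detections unchanged, which mirrors the local-only behaviour when no Azure endpoint is configured.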
Parameters
- detect _TEXT_
- Print the PII entities found in _TEXT_ (entity type, position, score) without modifying the input.
- mask _TEXT_
- Return _TEXT_ with detected PII rewritten according to --strategy.
- file _INPUT_
- Read text from the _INPUT_ file, mask the detected PII, and write the result to stdout or to the path given with -o.
- --strategy _STRATEGY_
- How to rewrite each detected entity. One of: replace, redact, mask, hash.
- -o _FILE_
- Write masked output to _FILE_ instead of standard output (used with file).
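To make the difference between the four strategy names concrete, here is a minimal sketch of one plausible reading of each. The exact output of pii-shield is not documented on this page, so the placeholder format, the `[REDACTED]` marker, the asterisk masking, and the truncated SHA-256 digest are all assumptions. Detection here is a simple email regex rather than the tool's engines, and `rewrite` and `mask_emails` are hypothetical names.

```python
import hashlib
import re

# Simplified email pattern used only for this illustration.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def rewrite(value, entity_type, strategy):
    """Rewrite one detected value (assumed semantics, not pii-shield's own)."""
    if strategy == "replace":
        return f"<{entity_type}>"  # swap in a type placeholder
    if strategy == "redact":
        return "[REDACTED]"        # drop the value entirely
    if strategy == "mask":
        return "*" * len(value)    # keep the length, hide the characters
    if strategy == "hash":
        # Same input always gives the same token, so records stay joinable.
        return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
    raise ValueError(f"unknown strategy: {strategy!r}")


def mask_emails(text, strategy):
    """Apply the chosen strategy to every email address found in text."""
    return EMAIL.sub(lambda m: rewrite(m.group(0), "EMAIL", strategy), text)


if __name__ == "__main__":
    sample = "My email is john@example.com"
    for name in ("replace", "redact", "mask", "hash"):
        print(name, "->", mask_emails(sample, name))
```

The hash variant is the one that matters for deterministic anonymisation: the same address maps to the same token on every run, while the other three discard the link between occurrences.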
FAQ
How do I run a basic pii-shield example?
Run `pii-shield detect "Contact john@example.com for help"` in a terminal, then substitute your own text, or switch to the mask or file subcommands and adjust the file names and --strategy value as needed.