Presidio analyzer package
The Presidio analyzer is a Python-based service for detecting PII entities in text.
During analysis, it runs a set of different PII Recognizers, each one in charge of detecting one or more PII entities using different mechanisms.
Presidio analyzer comes with a set of predefined recognizers, and can be extended with custom recognizers. Both predefined and custom recognizers use regular expressions, Named Entity Recognition (NER), and other logic to detect PII in unstructured text.
from presidio_analyzer import AnalyzerEngine
# Set up the engine; this loads the NLP module (spaCy model by default) and the other PII recognizers
analyzer = AnalyzerEngine()
# Call analyzer to get results
results = analyzer.analyze(
    text="My phone number is 212-555-5555",
    entities=["PHONE_NUMBER"],
    language="en",
)
print(results)
Additional documentation on installing, using, and extending the Analyzer can be found in the Analyzer section of the Presidio documentation.