Mask User Private Identity Information (PII)

The mask user PII dataflow reads JSON records and uses a regular expression (regex) to mask Social Security Numbers (SSNs). The following diagram is a visual representation of these operations, generated by sdf:

Package Variant

If you prefer to run this dataflow using packages, run the package-variant instead.

Step-by-step

Take a look at the dataflow.yaml to see how we've implemented it.

Run the Dataflow

Use the sdf command-line tool to run the dataflow:

sdf run --ui

Use --ui to generate the graphical representation and run the Studio.

Test the Dataflow

The sample data file used to run this test, ./sample-data/data.txt, has the following records:

{"name": "Alice", "ssn": "555-12-1212"}
{"name": "Bob", "ssn": "123-45-6789"}

Produce the data to the user-info topic:

fluvio produce user-info -f ./sample-data/data.txt

Check the data in the user-info topic:

fluvio consume user-info -Bd

Consume from the masked topic to retrieve the result:

fluvio consume masked -Bd
{"name": "Alice", "ssn": "***-**-****"}
{"name": "Bob", "ssn": "***-**-****"}
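
To make the masking step concrete, here is a minimal Python sketch of the same input-to-output transformation. It is an illustration only, not the code in dataflow.yaml: the pattern `\d{3}-\d{2}-\d{4}` and the helper name `mask_ssn` are assumptions chosen to match the sample records above.

```python
import re

# Assumed SSN shape: 3 digits, 2 digits, 4 digits, separated by hyphens.
# The pattern actually used by the dataflow may differ.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def mask_ssn(record_line: str) -> str:
    """Replace every SSN-shaped substring in a record line with a fixed mask."""
    return SSN_PATTERN.sub("***-**-****", record_line)


if __name__ == "__main__":
    # Same records as ./sample-data/data.txt
    sample = [
        '{"name": "Alice", "ssn": "555-12-1212"}',
        '{"name": "Bob", "ssn": "123-45-6789"}',
    ]
    for line in sample:
        print(mask_ssn(line))
```

The sketch operates on the raw record text rather than the parsed JSON, so an SSN-shaped value is masked in whichever field it appears.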

Clean-up

Exit the sdf terminal and clean up. The --force flag removes the topics:

sdf clean --force
