Comprehensive Guide to File Anonymization
Welcome to the professional documentation for securely anonymizing sensitive data using our platform. This guide will walk you through the entire process, from file upload to applying anonymization configurations and saving your anonymized files.
Step 1: Select New FileSafe Run
To begin, select a new FileSafe run from the sidebar. This will initiate the anonymization process.
- Use Click to Upload or Drag and Drop to provide the file you want to anonymize.
- After clicking Click to Upload, select the file from your local device.
- Alternatively, you can use one of the available Demo Files to explore and understand the process.
Step 2: Upload Your File
- Users will be directed to the next screen after the file upload is complete.
- On this screen, "Column" represents the header of the uploaded .xls or .csv file.
- For example, if the uploaded file contains 5 columns, the table will display 5 rows, each representing a column header.
Note: The first row of your file will be treated as the header, and the platform will use it as field names for the anonymization process. Ensure your headers are accurate.
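To check what the platform will treat as column headers, you can list the first row of a CSV file locally before uploading. The following is a minimal sketch using Python's standard `csv` module; the function name `list_column_headers` and the sample data are hypothetical and not part of the platform.

```python
import csv
import io


def list_column_headers(csv_text: str) -> list[str]:
    """Return the first row of a CSV document as a list of column headers."""
    reader = csv.reader(io.StringIO(csv_text))
    # The first row is treated as the header; an empty file yields no headers.
    return next(reader, [])


# Hypothetical sample file with 5 columns.
sample = (
    "Customer ID,Name,Email,Phone,Birth Date\n"
    "1,Ada Lovelace,ada@example.com,+441234567890,1815-12-10\n"
)
headers = list_column_headers(sample)
for position, header in enumerate(headers, start=1):
    print(position, header)
```

A file with 5 columns produces 5 header entries, which matches the 5 rows shown on the configuration screen.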
Step 3: Derive PII
- Use the Detect PII Types function to automatically identify fields containing Personally Identifiable Information (PII) or sensitive data.
- This step uses AI to categorize the personal information within your uploaded file.
- The categorization allows users to apply appropriate anonymization rules based on the detected data types.
To initiate the detection process:
- Click on the Detect PII Types with AI button to start the identification and categorization.
Post Detection Results:
- Once PII detection completes successfully, the Utility Parameter column is populated with the detected PII categories, such as:
  - Name
  - Phone Number
  - Email
  - Date
- If no PII data is detected in a column, the value "No Change" is assigned to that column.
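To make the detection result more concrete, the sketch below assigns a category to a column based on its values and falls back to "No Change" when nothing matches. It is only an illustration: it uses simple regular expressions rather than the platform's AI-based detection, it does not attempt to recognize names, and the function name `guess_utility_parameter` and the patterns are assumptions.

```python
import re

# Deliberately simple patterns, for illustration only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
PHONE_RE = re.compile(r"^\+?[\d\s().-]{7,20}$")

# Date is checked before Phone Number because an ISO date also
# matches the loose phone pattern.
CHECKS = (("Email", EMAIL_RE), ("Date", DATE_RE), ("Phone Number", PHONE_RE))


def guess_utility_parameter(values: list[str]) -> str:
    """Return a category if every non-empty value matches it, else 'No Change'."""
    non_empty = [value.strip() for value in values if value.strip()]
    if not non_empty:
        return "No Change"
    for label, pattern in CHECKS:
        if all(pattern.match(value) for value in non_empty):
            return label
    return "No Change"
```

For example, a column containing only email addresses would be labeled Email, while a column of free-text product names would be labeled No Change.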
Privacy Relevance Assignment:
- The system also assigns a Privacy Relevance tag to each column, which can be:
  - Personal
  - Confidential
  - Not Relevant
Step 4: Select Collaboration Group or Create a New One
- Collaboration Groups help maintain consistency across different datasets to ensure uniform anonymization across projects.
- Users can either select an existing collaboration group or create a new one to anonymize the file consistently.
- If no collaboration group is selected, a Default Group will be automatically assigned.
To select or create a Collaboration Group:
- Select an existing collaboration group from the drop-down menu under Run Properties.
- Click Create New Collaboration Group to create a new group.
- While creating a new collaboration group, the user must enter the mandatory parameter: Project Name.
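The idea of consistent anonymization within a group can be illustrated with a keyed hash: the same input value maps to the same token whenever the same group secret is used, and to a different token under another group. This is a sketch of one common technique, not a description of how the platform implements Collaboration Groups; `consistent_token` and the secrets are hypothetical.

```python
import hashlib
import hmac


def consistent_token(value: str, group_secret: bytes, length: int = 12) -> str:
    """Map a value to a stable pseudonym that depends on a per-group secret."""
    digest = hmac.new(group_secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


# Hypothetical secrets standing in for two collaboration groups.
group_a = b"secret-for-group-a"
group_b = b"secret-for-group-b"

# Same value, same group: identical token, even across different files.
token_file_1 = consistent_token("CUST-1001", group_a)
token_file_2 = consistent_token("CUST-1001", group_a)

# Same value, different group: a different token.
token_other_group = consistent_token("CUST-1001", group_b)
```

Under this assumption, two files anonymized with the same group can still be joined on the anonymized column, while files from different groups cannot.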
Step 5: Configure Utility Parameters and Conditions
After uploading your file, it's time to configure utility parameters and their corresponding conditions for each data field. These settings determine how the data in the file is anonymized.
- If the user does not select a Utility Condition, Use Default Utility Condition is assigned automatically.
- The default values for Utility Parameters are as follows:
  - Email → DUMMY_DOMAIN
  - Name → FULL_NAME
  - Date → Consistent
  - Phone Number → RANDOM
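The fallback behavior described above can be sketched as a simple lookup: use the selected condition if there is one, otherwise use the listed default. The mapping mirrors the defaults in this guide; the function name `resolve_condition` is hypothetical.

```python
from typing import Optional

# Default Utility Conditions as listed in this guide.
DEFAULT_CONDITIONS = {
    "Email": "DUMMY_DOMAIN",
    "Name": "FULL_NAME",
    "Date": "Consistent",
    "Phone Number": "RANDOM",
}


def resolve_condition(utility_parameter: str, selected: Optional[str] = None) -> Optional[str]:
    """Return the user's selection, or the listed default when none is given."""
    if selected:
        return selected
    # Parameters without a listed default (for example Number) return None.
    return DEFAULT_CONDITIONS.get(utility_parameter)
```

For instance, an Email column with no selection resolves to DUMMY_DOMAIN, while an explicit selection always takes precedence.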
Customizing Utility Parameters and Utility Conditions:
- Users can customize anonymization rules by modifying the Utility Parameter and Utility Conditions for each column.
- The available Utility Parameters and Utility Conditions are described in the Utility Parameters and Conditions Table below.
Example:
- In the example File Safe Run:
  - The Utility Parameter for the column Customer ID has been changed to Consistent ID instead of No Change.
  - The Utility Conditions have been changed to:
    - LAST_NAME for column #1
    - KEEP DOMAIN for column #5
Privacy Relevance:
- Users also have the option to manually change the Privacy Relevance of any column as per their specific requirements.
Utility Parameters and Conditions Table
Utility Parameter | Conditions | Description |
---|---|---|
No Change | N/A | Leaves the data unchanged. |
Clear Values | N/A | Clears the data in the selected fields. |
Email | Dummy Domain, Keep Domain, All Caps | Anonymizes email addresses with additional formatting options. |
Name | First Name, Last Name, All Caps, Full Name, Name | Handles name fields with specific conditions for first name, last name, etc. |
Consistent ID | N/A | Generates a consistent identifier for tracking across records. |
Fixed Value | Text field for the user to enter a fixed value | Replaces the field with the fixed value entered by the user. |
Date | Same Year, Random, Adult, Consistent | Anonymizes date fields while retaining some options for consistency. |
Phone Number | REMOVE_COUNTRY_CODE, RANDOM, CONSISTENT | Anonymizes phone numbers, optionally removing the country code or keeping values consistent. |
Number | N/A | Randomizes or anonymizes numerical data. |
Custom Expressions | N/A | Not supported in File Safe. |
Org Name | Company Name, Bank Name, Remove Org Suffix, All Caps | Anonymizes the Organization Name to either a Company Name or a Bank Name based on the selected utility condition. Maintains consistency during anonymization. |
Material Name | Fixed Mat, Random Mat, All Caps | Anonymizes the Material Name into different formats based on the selected condition. |
IBAN | N/A | Anonymizes the IBAN to an IBAN of the same country while retaining the length and the pattern. |
Account Number | N/A | Anonymizes the Account Number using format-preserving encryption, retaining the length and numeric pattern. |
Condition-Specific Details:
- Email Conditions:
  - Dummy Domain: Replaces the email domain with a dummy domain (e.g., example.com).
  - Keep Domain: Retains the original domain while anonymizing the rest of the email.
  - All Caps: Anonymizes the complete email and converts the entire email to uppercase.
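The three email conditions can be sketched as follows. This is an illustration under stated assumptions, not the platform's implementation: the local part is replaced by a short hash as a stand-in for the real anonymization, `example.com` is used as the dummy domain, and All Caps is assumed to also use the dummy domain. The function name `anonymize_email` and the condition identifiers are hypothetical.

```python
import hashlib


def anonymize_email(email: str, condition: str = "DUMMY_DOMAIN") -> str:
    """Anonymize an email address according to one of three conditions."""
    local, _, domain = email.partition("@")
    # Stand-in for the real anonymization: a short hash of the local part.
    masked_local = "user_" + hashlib.sha256(local.lower().encode("utf-8")).hexdigest()[:8]
    if condition == "DUMMY_DOMAIN":
        # Replace the domain with a dummy domain.
        return f"{masked_local}@example.com"
    if condition == "KEEP_DOMAIN":
        # Keep the original domain, anonymize only the local part.
        return f"{masked_local}@{domain}"
    if condition == "ALL_CAPS":
        # Anonymize the complete email, then convert it to uppercase.
        return f"{masked_local}@example.com".upper()
    raise ValueError(f"Unknown condition: {condition}")
```

With Keep Domain, an address at `acme.org` stays at `acme.org`, which preserves domain-level statistics while hiding the individual.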
- Name Conditions:
  - First Name: Anonymizes only the first name field.
  - Last Name: Anonymizes only the last name field.
  - All Caps: Anonymizes the name and converts the name to uppercase.
  - Full Name: Anonymizes the entire full name as a single unit.
- Date Conditions:
  - Same Year: Keeps the year consistent across all records.
  - Random: Randomizes the entire date.
  - Adult: Ensures the date reflects an adult age.
  - Consistent: Keeps the date consistent across records.
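One possible reading of the four date conditions is sketched below. The interpretations are assumptions: Same Year keeps the original year and randomizes the day, Adult produces a date at least 18 years before a reference date, and Consistent derives the output from the input so equal inputs give equal outputs. The function name `anonymize_date`, the condition identifiers, and the date ranges are hypothetical.

```python
import datetime as dt
import hashlib
import random

BASE_DATE = dt.date(1925, 1, 1)  # arbitrary start of the illustrative range
RANGE_DAYS = 36500               # roughly 100 years


def anonymize_date(value: dt.date, condition: str, today: dt.date, seed: int = 0) -> dt.date:
    """Return an anonymized date for the given condition."""
    rng = random.Random(seed)
    if condition == "SAME_YEAR":
        # Assumption: keep the original year, randomize the day within it.
        start = dt.date(value.year, 1, 1)
        days_in_year = (dt.date(value.year + 1, 1, 1) - start).days
        return start + dt.timedelta(days=rng.randrange(days_in_year))
    if condition == "RANDOM":
        return BASE_DATE + dt.timedelta(days=rng.randrange(RANGE_DAYS))
    if condition == "ADULT":
        # Assumption: a date between 18 and 90 years before `today`.
        age_days = rng.randrange(18 * 366, 90 * 365)
        return today - dt.timedelta(days=age_days)
    if condition == "CONSISTENT":
        # Seed from the input so equal inputs always give equal outputs.
        digest = hashlib.sha256(value.isoformat().encode("utf-8")).digest()
        derived = random.Random(int.from_bytes(digest[:8], "big"))
        return BASE_DATE + dt.timedelta(days=derived.randrange(RANGE_DAYS))
    raise ValueError(f"Unknown condition: {condition}")
```

Passing `today` explicitly keeps the Adult rule testable and independent of the system clock.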
- Phone Conditions:
  - Remove Country Code (REMOVE_COUNTRY_CODE): Removes the country code (the first 4 digits of the original phone number) and generates a consistent anonymized value for the remaining number.
  - Random: Anonymizes the complete phone number.
  - Consistent: Randomizes the first occurrence of a phone number and then maintains consistency for the rest.
- Org Name Conditions:
  - If the user doesn't select a value, the default condition for an Org Name is always Company Name.
  - Bank Name: Replaces the Organization Name with an anonymized Bank Name and maintains consistency during anonymization.
  - Company Name: Replaces the Organization Name with an anonymized Company Name and maintains consistency during anonymization. Any suffix at the end of the company name is retained.
  - Remove Org Suffix: Anonymizes the Organization Name to a Company Name and removes the suffix. The suffixes identified by the application are listed below. Users cannot select the combination of Bank Name and Remove Org Suffix.
  - All Caps: Anonymizes the Org Name to a Company Name and converts it to uppercase.
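The rules for Org Name conditions (suffix retention, suffix removal, uppercase, and the disallowed Bank Name plus Remove Org Suffix combination) can be sketched as a small post-processing step applied to an already generated replacement name. The suffix list here is a hypothetical placeholder, since the platform's own list is not reproduced in this guide; the function names and condition identifiers are also assumptions.

```python
# Hypothetical suffix examples; the platform's actual list is not shown here.
ORG_SUFFIXES = ("GmbH", "Ltd", "Inc")


def split_suffix(name: str) -> tuple[str, str]:
    """Split a trailing organization suffix from a name, if one is present."""
    for suffix in ORG_SUFFIXES:
        if name.endswith(" " + suffix):
            return name[: -(len(suffix) + 1)], suffix
    return name, ""


def apply_org_conditions(original: str, replacement: str, conditions: frozenset = frozenset()) -> str:
    """Apply the selected conditions to an anonymized organization name."""
    if {"BANK_NAME", "REMOVE_ORG_SUFFIX"} <= conditions:
        raise ValueError("Bank Name cannot be combined with Remove Org Suffix")
    _, suffix = split_suffix(original)
    result = replacement
    is_company = "BANK_NAME" not in conditions
    # Company names keep the original suffix unless it is explicitly removed.
    if is_company and suffix and "REMOVE_ORG_SUFFIX" not in conditions:
        result = f"{result} {suffix}"
    if "ALL_CAPS" in conditions:
        result = result.upper()
    return result
```

Rejecting the invalid combination up front mirrors the restriction in the user interface, so a configuration error surfaces before any data is processed.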
- Material Name Conditions:
  - If the user doesn't select a value, the default condition for a Material Name is always RANDOM_MAT.
  - RANDOM_MAT: Replaces the Material Name with a MATERIAL_SEQUENCE_NO and maintains consistency across the data. Example: "Inflammable Liquid" → MATERIAL_1
  - FIXED_MAT: Replaces the Material Name with an anonymized material name and maintains consistency across the data. Example: "Inflammable Liquid" → RUBBER_1
  - All Caps: Anonymizes the Material Name using the selected method and capitalizes it.
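The RANDOM_MAT behavior, where each distinct material name receives a sequence-numbered label that is reused on every later occurrence, can be sketched with a dictionary. This is a minimal illustration that assumes labels are numbered in order of first appearance; the class name `MaterialSequencer` is hypothetical.

```python
class MaterialSequencer:
    """Assign MATERIAL_<n> labels in order of first appearance."""

    def __init__(self) -> None:
        self._labels: dict[str, str] = {}

    def anonymize(self, material_name: str) -> str:
        # Reuse the existing label so repeated names stay consistent.
        if material_name not in self._labels:
            self._labels[material_name] = f"MATERIAL_{len(self._labels) + 1}"
        return self._labels[material_name]


sequencer = MaterialSequencer()
column = ["Inflammable Liquid", "Steel Rod", "Inflammable Liquid"]
anonymized = [sequencer.anonymize(name) for name in column]
print(anonymized)  # ['MATERIAL_1', 'MATERIAL_2', 'MATERIAL_1']
```

Both occurrences of "Inflammable Liquid" map to MATERIAL_1, which is the consistency property described for this condition.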
Step 6: Run the File Anonymization
- Once the Utility Parameters and Utility Conditions have been configured, proceed to run the anonymization process.
- Click on Run to execute the File Safe Run.
Post-Submission:
- After clicking Run, the user will be redirected to the File Safe Tab on the Jobs Page.
- On the Jobs Page, users can track the status and progress of the anonymization job.
Step 7: Monitor Job Status
Navigate to the Jobs Page by clicking History on the left navigation panel. There you can check the progress of the File Safe anonymization job:
- Use the Refresh button in the top right corner to view the updated status.
- The job status can be:
  - Not Started
  - In Progress
  - Finished
  - Failed
- Users can also view:
- Values Processed β the number of successfully anonymized records.
- Records Failed β the number of failed records.
- Once the File Anonymization is in the Finished state, click on the Download icon under the Actions column to download the anonymized file.
- Click on the corresponding Run Name in the Jobs Page to view the detailed history of the File Safe Job.
- The detailed history shows:
- Utility Parameters
- Utility Conditions
- Privacy Relevance
Conclusion
By following these steps, you can securely anonymize your data with ease. If you encounter any difficulties, our support team is always available to assist.