Discovering and protecting sensitive data
Good governance and dedicated data tooling can help us with sensitive data discovery, classification, and profiling. More often than not, however, the data used in our ML experiments comes from outside sources, or we are not developing for our own organization. In those cases, we need to train ourselves to recognize sensitive data and to do a quick cleanup before we use it in Azure Machine Learning.
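To make the idea of a quick cleanup concrete, here is a minimal sketch that scans free text for two common patterns and masks them before the data goes into an experiment. The function names and the two regular expressions are illustrative assumptions, not part of any Azure Machine Learning API, and they are deliberately simple: real datasets need broader patterns and, ideally, a dedicated detection tool.

```python
import re

# Hypothetical, deliberately simple patterns for illustration only.
# They will miss many real-world formats and may flag false positives.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_sensitive(text: str) -> dict:
    """Return the matches found for each (assumed) sensitive pattern."""
    return {
        "email": EMAIL_RE.findall(text),
        "ssn": US_SSN_RE.findall(text),
    }


def mask_sensitive(text: str) -> str:
    """Replace each match with a placeholder token."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = US_SSN_RE.sub("[SSN]", text)
    return text


if __name__ == "__main__":
    sample = "Contact jane.doe@example.com, SSN 123-45-6789."
    print(find_sensitive(sample))
    print(mask_sensitive(sample))
```

Masking with placeholder tokens, rather than deleting the matches, keeps the sentence structure intact, which can matter if the text is later used for training.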
Identifying sensitive data
Sensitive data is any information that, if exposed, could cause harm or privacy breaches, or lead to identity theft, monetary loss, or other adverse consequences for individuals or organizations. Because of these risks, it requires special protection.
Sensitive data falls into many categories. Several are outlined next, together with examples that we need to be aware of:
- Personally identifiable...