Algorithm for Masking Data for Increased Data Confidentiality and Researchers’ Use

Technology #14581

Researchers
Samuel Shangwu Wu
Shigang Chen
Managed By
Richard Croley
Assistant Director 352-392-8929
Patent Protection
US Patent Pending

Data Masking for Research Usability and Increased Data Security

This algorithm masks sensitive data while keeping it usable for research. A major challenge in scientific research is limited data availability due to privacy concerns, and breaches are costly: a 2014 estimate put the average cost of a data breach for a United States organization at $5.85 million. Many current techniques remove the identities of the data providers but leave the remaining information unencrypted. Other encryption methods are more secure but make the encrypted data unusable for analysis. Researchers at the University of Florida have developed a data masking method that allows patients’ sensitive data to be masked and used at the same time. The algorithm hides the original sensitive data from everyone, including data collectors, while still allowing many commonly used statistical techniques to produce the same results on the masked data as they would on the original data. It can be integrated into existing technologies, including mobile devices, data storage, analytical tools, and data exchange systems.

APPLICATIONS

Data protection software allowing researchers to mine for accurate results while maintaining data confidentiality and patient privacy

ADVANTAGES

  • Integrates into current technologies, requiring no additional hardware or software
  • Complies with data privacy requirements, allowing it to be implemented immediately without modification
  • Masks data for data confidentiality while maintaining usability and accuracy for researchers

TECHNOLOGY

This data masking algorithm increases the confidentiality of patients’ information while keeping data mining straightforward for researchers. The masking is performed so that many statistical techniques commonly used in medical and social research produce the same results on the masked data as they would on the original data. The technology integrates matrix encryption, cryptographic algorithms, cyber-secure protocols, distributed computing, and applied statistical methods into practical privacy-preserving solutions. The approach not only removes patient identifiers but also masks all other data, so the original data are completely hidden while statistical methods can still mine the transformed data for correct research results.
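
The listing does not disclose the masking transform itself, so the following is only an illustrative sketch of how a matrix mask can leave a statistical result unchanged. It assumes a single random Householder reflection as the mask, which is an orthogonal matrix chosen here for simplicity and not the researchers' method. The function names, the seed, and the small age and blood-pressure data set are all hypothetical. Because an orthogonal mask preserves inner products between columns, the least-squares coefficients computed from the masked columns should match those from the originals up to floating-point rounding.

```python
import random


def householder_mask(columns, seed):
    """Mask equal-length columns with one random Householder reflection.

    H = I - 2 v v^T / (v^T v) is orthogonal, so inner products between
    columns are preserved. ASSUMPTION: this stands in for the undisclosed
    masking matrix and is not a secure mask on its own.
    """
    n = len(columns[0])
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    vv = sum(a * a for a in v)
    masked = []
    for col in columns:
        scale = 2.0 * sum(a * c for a, c in zip(v, col)) / vv
        masked.append([c - scale * a for c, a in zip(col, v)])
    return masked


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def ols_two_columns(x0, x1, y):
    """Solve the 2x2 normal equations (X^T X) beta = X^T y by Cramer's rule."""
    a, b, d = dot(x0, x0), dot(x0, x1), dot(x1, x1)
    p, q = dot(x0, y), dot(x1, y)
    det = a * d - b * b
    return ((p * d - b * q) / det, (a * q - b * p) / det)


# Hypothetical records: an intercept column, age, and blood pressure.
ones = [1.0] * 6
age = [34.0, 45.0, 52.0, 61.0, 29.0, 70.0]
bp = [118.0, 127.0, 131.0, 140.0, 115.0, 149.0]

# The intercept column is masked together with the data columns.
ones_masked, age_masked, bp_masked = householder_mask([ones, age, bp], seed=7)

beta_original = ols_two_columns(ones, age, bp)
beta_masked = ols_two_columns(ones_masked, age_masked, bp_masked)

print("original fit:", beta_original)
print("masked fit:  ", beta_masked)
```

In this sketch the analyst only ever needs the masked columns, yet recovers the same intercept and slope. A single reflection would be far too weak to protect real records; it is used only to keep the arithmetic easy to follow.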