Data Privacy Using Anonymization Method on Open Data

Authors

  • Thi Minh Chau Le, Ho Chi Minh City University of Technology and Education, Vietnam, https://orcid.org/0009-0004-8372-9098
  • Tran Thi Van Nguyen, Ho Chi Minh City University of Technology and Education, Vietnam

Corresponding author's email:

chaultm@hcmute.edu.vn

DOI:

https://doi.org/10.54644/jte.2024.1472

Keywords:

Open Data, Anonymity, Privacy, k-anonymity, ℓ-diversity

Abstract

Open Data is data shared among organizations, agencies, businesses, and governments, mainly to serve community projects in fields such as health, environment, and education. Countries around the world are building smart cities and smart governments, and applying Open Data in these projects has brought significant benefits. However, sharing data also carries risks: recent studies point out that Open Data can reveal information about individuals, organizations, and businesses. Anonymization methods for securing data, such as k-anonymity and ℓ-diversity, have been researched and applied for many years, but they have mainly been implemented and tested on the traditional data sets of businesses and organizations rather than on Open Data. This paper therefore examines Open Data and anonymization-based data security methods, implements several of these methods on Open Data, and analyzes and evaluates the results.

Author Biographies

Thi Minh Chau Le, Ho Chi Minh City University of Technology and Education, Vietnam

Lê Thị Minh Châu graduated in Informatics from the Open University in 2005 and received a master's degree in computer science from Ho Chi Minh City University of Technology in 2012. The author currently works at the Faculty of Information Technology, HCMC University of Technology and Education. Research interests include security, privacy, Big Data, AI, and related technologies. Email: chaultm@hcmute.edu.vn. ORCID: https://orcid.org/0009-0004-8372-9098

Tran Thi Van Nguyen, Ho Chi Minh City University of Technology and Education, Vietnam

Nguyễn Trần Thi Văn received a BS degree in Information Technology in 2002 from the University of Natural Sciences – National University Hochiminh City and completed a master's program in Computer Science in 2015 at the University of Information Technology – National University Hochiminh City. The author has worked as a lecturer at the Faculty of Information Technology, University of Technology and Education Hochiminh City, since 2003. Research interests include software engineering, mobile application development, data mining, social network analysis, and machine learning. Email: nttvan@hcmute.edu.vn

Published

28-04-2024

How to Cite

[1]
Lê Thị Minh Châu and Nguyễn Trần Thi Văn, “Data Privacy Using Anonymization Method on Open Data”, JTE, vol. 19, no. 02, pp. 12–21, Apr. 2024.

Section

Research Article
