Privacy preserving algorithm based on trajectory location and shape similarity

doi:10.11959/j.issn.1000-436x.2015043

Abstract

A variety of trajectory anonymization algorithms have been proposed to reduce the risk of privacy disclosure when trajectory data are published. However, when computing trajectory similarity, existing algorithms ignore the influence of trajectory shape, so the anonymized trajectory sets they produce have relatively low utility. To address this problem, a new trajectory similarity measure model is proposed that considers the shape of a trajectory in addition to its temporal and spatial elements. The measure is computable in polynomial time, can calculate the distance between trajectories defined over different time spans, and measures the similarity between trajectories more accurately and quickly. On this basis, a privacy preserving algorithm based on trajectory location and shape similarity is proposed. It uses greedy clustering to maximize the similarity of the trajectories within each cluster and forms a data "mask" from true original locations to satisfy trajectory k-anonymity, improving the utility of the trajectory data while protecting it effectively. Finally, experimental results on a synthetic trajectory data set and a real trajectory data set show that the proposed algorithm costs less time and offers higher data utility than comparable previous proposals in the literature.

Key words: spatio-temporal trajectory data, publication of trajectory data, greedy clustering, data mask, trajectory anonymity
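The abstract gives no formulas or pseudocode, so the following is only a minimal sketch of the general idea of greedy clustering into groups of at least k trajectories. It is not the authors' algorithm. The names `toy_distance`, `greedy_k_clusters`, and `shape_weight` are hypothetical, and the distance is a stand-in that adds a location term (mean point gap) to a shape term (mean heading gap). Unlike the measure described above, it simply compares samples index by index and does not handle different time spans.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float, float]  # (t, x, y); t is carried along but unused here
Trajectory = Sequence[Point]


def toy_distance(a: Trajectory, b: Trajectory, shape_weight: float = 1.0) -> float:
    """Stand-in distance (an assumption, not the paper's measure).

    Location term: mean Euclidean gap between the i-th samples.
    Shape term: mean absolute difference between segment headings.
    """
    n = min(len(a), len(b))
    if n == 0:
        return math.inf
    loc = sum(math.hypot(a[i][1] - b[i][1], a[i][2] - b[i][2]) for i in range(n)) / n
    if n < 2:
        return loc

    def heading(tr: Trajectory, i: int) -> float:
        return math.atan2(tr[i + 1][2] - tr[i][2], tr[i + 1][1] - tr[i][1])

    gaps = []
    for i in range(n - 1):
        d = abs(heading(a, i) - heading(b, i)) % (2 * math.pi)
        gaps.append(min(d, 2 * math.pi - d))  # wrap the angle gap into [0, pi]
    return loc + shape_weight * sum(gaps) / (n - 1)


def greedy_k_clusters(
    trajs: Sequence[Trajectory],
    k: int,
    dist: Callable[[Trajectory, Trajectory], float] = toy_distance,
) -> List[List[int]]:
    """Greedily group trajectory indices so that every cluster has >= k members."""
    if k < 1:
        raise ValueError("k must be at least 1")
    remaining = list(range(len(trajs)))
    clusters: List[List[int]] = []
    while len(remaining) >= k:
        seed = remaining.pop(0)
        # Pick the k - 1 trajectories closest to the seed.
        remaining.sort(key=lambda j: dist(trajs[seed], trajs[j]))
        clusters.append([seed] + remaining[: k - 1])
        remaining = remaining[k - 1:]
    # Fewer than k trajectories are left: attach each to the cluster
    # whose seed is nearest, so no cluster falls below size k.
    for j in remaining:
        if not clusters:
            break  # fewer than k trajectories in total; nothing can be grouped
        nearest = min(clusters, key=lambda c: dist(trajs[c[0]], trajs[j]))
        nearest.append(j)
    return clusters


if __name__ == "__main__":
    east_1 = [(0, 0, 0), (1, 1, 0), (2, 2, 0)]
    east_2 = [(0, 0, 0.1), (1, 1, 0.1), (2, 2, 0.1)]
    north_1 = [(0, 10, 10), (1, 10, 11), (2, 10, 12)]
    north_2 = [(0, 10.1, 10), (1, 10.1, 11), (2, 10.1, 12)]
    print(greedy_k_clusters([east_1, east_2, north_1, north_2], k=2))
```

Each returned cluster would then be the unit over which an anonymization step (such as the data "mask" mentioned in the abstract) is applied; that step is not sketched here because the abstract does not describe it in enough detail.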

Chao WANG (王超), Jing YANG (杨静), Jian-pei ZHANG (张健沛). Privacy preserving algorithm based on trajectory location and shape similarity[J]. Journal on Communications (通信学报), 2015, 36(2): 144-157.
