Chinese Journal of Network and Information Security, 2020, Vol. 6, Issue (4): 77-94. doi: 10.11959/j.issn.2096-109x.2020044
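Fuxiang YUAN, Fenlin LIU, Chong LIU, Yan LIU, Xiangyang LUO
Revised: 2020-03-19
Online: 2020-08-15
Published: 2020-08-13
About the authors:
Fuxiang YUAN (1991- ), male, born in Jining, Shandong, is a Ph.D. candidate at Information Engineering University. His main research interests are cyberspace resource surveying and mapping and IP geolocation.
Fenlin LIU (1964- ), male, born in Liyang, Jiangsu, Ph.D., is a professor and doctoral supervisor at Information Engineering University. His main research interest is cyberspace security.
Chong LIU (1994- ), male, born in Fushun, Liaoning, is a master's student at Information Engineering University. His main research interests are cyberspace resource surveying and mapping and IP geolocation.
Yan LIU (1979- ), female, born in Jinan, Shandong, Ph.D., is an associate professor at Information Engineering University. Her main research interest is cyberspace security.
Xiangyang LUO (1978- ), male, born in Jingmen, Hubei, Ph.D., is a professor and doctoral supervisor at Information Engineering University. His main research interest is cyberspace security.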
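Abstract:
To resolve the aliases of interface IPs accurately and efficiently in support of IP geolocation, a large-scale network alias resolution algorithm (MLAR) is proposed. Filtering rules are designed from the statistical differences between alias IPs and non-alias IPs in delay, path, Whois and other attributes; before resolution, these rules exclude a large number of IPs that cannot be in an alias relationship, which improves resolution efficiency. Alias resolution is then recast as a classification problem: four novel features, including delay similarity and path similarity, are constructed to classify the possible alias IPs and non-alias IPs that remain after filtering. Experiments on million-scale samples from CAIDA show that, compared with RadarGun, MIDAR and TreeNET, accuracy improves by 15.8%, 4.8% and 5.7%, and time cost is reduced by up to 77.8%, 65.3% and 55.2%, respectively. When MLAR is applied to IP geolocation, the failure rates of three typical geolocation methods, SLG, LENCR and PoPG, are reduced by 65.5%, 64.1% and 58.1%, respectively.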
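The two-stage idea in the abstract (filter first, classify what remains) can be sketched as follows. This is only an illustration under stated assumptions: the rule functions, their thresholds, the record fields (`delay_ms`, `path`, `whois_org`) and the stand-in classifier are all hypothetical and are not taken from the paper, which uses five filtering rules and a trained classifier over four features.

```python
from itertools import combinations

# Hypothetical filtering rules: each returns True when a pair can be discarded.
def delay_gap_too_large(a, b, max_gap_ms=5.0):
    # The 5 ms threshold is an illustrative assumption, not a value from the paper.
    return abs(a["delay_ms"] - b["delay_ms"]) > max_gap_ms

def different_whois_org(a, b):
    # Assumed rule: interfaces registered to different organizations are not aliases.
    return a["whois_org"] != b["whois_org"]

def path_overlap(a, b):
    # Jaccard overlap of router IPs on the two probing paths (assumed feature).
    sa, sb = set(a["path"]), set(b["path"])
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def stand_in_classifier(a, b, min_overlap=0.8):
    # Placeholder for a trained classifier; a single threshold is used here.
    return path_overlap(a, b) >= min_overlap

def filter_then_classify(probes, rules, classify):
    """Return the IP pairs that survive every rule and are classified as aliases."""
    aliases = []
    for a, b in combinations(probes, 2):
        if any(rule(a, b) for rule in rules):
            continue  # discarded before the classification stage
        if classify(a, b):
            aliases.append((a["ip"], b["ip"]))
    return aliases

# Made-up measurements using documentation address ranges.
probes = [
    {"ip": "192.0.2.1", "delay_ms": 10.0,
     "path": ["10.0.0.1", "10.0.0.2", "10.0.0.3"], "whois_org": "ExampleNet"},
    {"ip": "192.0.2.2", "delay_ms": 10.4,
     "path": ["10.0.0.1", "10.0.0.2", "10.0.0.3"], "whois_org": "ExampleNet"},
    {"ip": "198.51.100.7", "delay_ms": 48.0,
     "path": ["10.0.0.1", "10.9.9.9"], "whois_org": "OtherNet"},
]
alias_pairs = filter_then_classify(
    probes, [delay_gap_too_large, different_whois_org], stand_in_classifier)
print(alias_pairs)
```

The point of the ordering is that the cheap rules run on every candidate pair, while the more expensive classification step only sees the pairs that no rule could exclude.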
Fuxiang YUAN, Fenlin LIU, Chong LIU, Yan LIU, Xiangyang LUO. MLAR: large-scale network alias resolution for IP geolocation[J]. Chinese Journal of Network and Information Security, 2020, 6(4): 77-94.
Table 1 Statistics of path similarity
Direction: number of differing router IPs in the two paths. Length: difference in hop count between the two probing paths.
IP pair | Category | Proportion | Direction ≤1 | Direction 2 | Direction ≥3 | Length ≤1 | Length 2 | Length ≥3
Alias IP pair | A | 98.1% | 95.2% | 2.9% | 0 | 96.4% | 1.7% | 0
Alias IP pair | B | 1.9% | 1.4% | 0.5% | 0 | 0 | 0 | 1.9%
Alias IP pair | C | 0 | 0 | 0 | 0 | 0 | 0 | 0
Alias IP pair | D | 0 | 0 | 0 | 0 | 0 | 0 | 0
Non-alias IP pair | A | 0.4% | 0.1% | 0.3% | 0 | 0.2% | 0.2% | 0
Non-alias IP pair | B | 13.1% | 4.3% | 8.8% | 0 | 0 | 0 | 13.1%
Non-alias IP pair | C | 34.4% | 0 | 0 | 34.4% | 10.9% | 23.5% | 0
Non-alias IP pair | D | 52.1% | 0 | 0 | 52.1% | 0 | 0 | 52.1%
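The non-zero cells of Table 1 suggest how the four categories relate to the two measures: A appears to combine a direction difference of at most 2 with a length difference of at most 2, B a small direction difference with a length difference of 3 or more, C the reverse, and D both large. The sketch below encodes that reading. It is an inference from the table rather than the paper's stated definition, and the hop-by-hop way of counting differing router IPs is likewise an assumption.

```python
def direction_diff(path_a, path_b):
    # Assumed definition: compare the paths hop by hop over their common
    # length and count the positions where the router IPs differ.
    return sum(1 for x, y in zip(path_a, path_b) if x != y)

def length_diff(path_a, path_b):
    # Difference in hop count between the two probing paths.
    return abs(len(path_a) - len(path_b))

def similarity_category(path_a, path_b, limit=2):
    """Map a pair of paths to category A-D as inferred from Table 1."""
    direction_close = direction_diff(path_a, path_b) <= limit
    length_close = length_diff(path_a, path_b) <= limit
    if direction_close and length_close:
        return "A"
    if direction_close:
        return "B"  # similar direction, clearly different length
    if length_close:
        return "C"  # similar length, clearly different direction
    return "D"

# Made-up paths; the labels stand in for router IPs.
base = ["r1", "r2", "r3", "r4"]
print(similarity_category(base, ["r1", "r2", "r3", "r5"]))
print(similarity_category(base, ["r1", "r2", "r3", "r4", "r6", "r7", "r8"]))
```

Under this reading, 98.1% of alias pairs fall in category A while 86.5% of non-alias pairs fall in C or D, which is the kind of gap a path-based filtering rule can exploit.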
Table 5 The filtering results
Proportion of the two classes of samples filtered out by each rule in different regions.
Filtering rule | Alias IP: Beijing | Alias IP: Shanghai | Alias IP: New York | Alias IP: Miami | Non-alias IP: Beijing | Non-alias IP: Shanghai | Non-alias IP: New York | Non-alias IP: Miami
1) | 0 | 0 | 0 | 0 | 22.2% | 18.9% | 24.5% | 26.3%
2) | 0 | 0 | 0 | 0 | 5.3% | 3.6% | 7.1% | 5.0%
3) | 0 | 0.0041% | 0 | 0 | 26.6% | 30.6% | 26.9% | 28.9%
4) | 0 | 0 | 0 | 0.0023% | 20.4% | 21.1% | 22.6% | 23.3%
5) | 0 | 0 | 0 | 0 | 8.9% | 7.5% | 3.5% | 2.7%
Total | 0 | 0.0041% | 0 | 0.0023% | 83.4% | 81.7% | 84.6% | 86.2%
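As a quick arithmetic check on Table 5, the per-rule percentages for non-alias samples add up to the reported totals, which suggests each filtered sample is attributed to a single rule. The snippet below recomputes the totals and the share of non-alias samples left for the classification stage; the variable names are illustrative.

```python
# Non-alias samples filtered by rules 1) to 5), in percent, copied from Table 5.
non_alias_filtered = {
    "Beijing": [22.2, 5.3, 26.6, 20.4, 8.9],
    "Shanghai": [18.9, 3.6, 30.6, 21.1, 7.5],
    "New York": [24.5, 7.1, 26.9, 22.6, 3.5],
    "Miami": [26.3, 5.0, 28.9, 23.3, 2.7],
}
reported_total = {"Beijing": 83.4, "Shanghai": 81.7, "New York": 84.6, "Miami": 86.2}

# Sum of the per-rule rates, and what is left after filtering.
totals = {region: round(sum(rates), 1) for region, rates in non_alias_filtered.items()}
remaining = {region: round(100.0 - total, 1) for region, total in totals.items()}

for region in non_alias_filtered:
    print(region, totals[region], reported_total[region], remaining[region])
```

Between roughly 14% and 18% of non-alias samples remain after filtering in each region, while the alias samples lost to filtering never exceed 0.0041%.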
Table 6 Training set, test set construction and corresponding classification results
Test No. | Training set: alias IP pairs | Training set: non-alias IP pairs | Test set: alias IP pairs | Test set: non-alias IP pairs | Acc | Ma | Fa
a1 | | | | | 96.3% | 2.7% | 4.7%
a2 | | | | | 95.4% | 4.4% | 4.8%
a3 | | | | | 96.0% | 3.8% | 4.1%
b1 | | | | | 96.4% | 3.4% | 3.7%
b2 | | | | | 95.9% | 4.0% | 4.2%
b3 | | | | | 96.8% | 3.1% | 3.3%
c1 | | | | | 96.0% | 3.8% | 4.2%
c2 | | | | | 96.9% | 2.9% | 3.2%
c3 | | | | | 96.6% | 3.3% | 3.5%
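The three result columns can be illustrated with a small helper. This page does not define Ma and Fa, so the sketch assumes that Ma is the share of alias pairs classified as non-alias (miss rate) and Fa is the share of non-alias pairs classified as alias (false-alarm rate). The counts in the example are hypothetical, since the set sizes are missing from the table above.

```python
def classification_rates(tp, fn, fp, tn):
    """Return (Acc, Ma, Fa) from confusion-matrix counts.

    tp / fn: alias pairs classified as alias / as non-alias
    fp / tn: non-alias pairs classified as alias / as non-alias
    Ma and Fa follow the assumed definitions given in the text above.
    """
    acc = (tp + tn) / (tp + fn + fp + tn)
    ma = fn / (tp + fn)
    fa = fp / (fp + tn)
    return acc, ma, fa

# Hypothetical balanced test set: 1000 alias pairs and 1000 non-alias pairs.
acc, ma, fa = classification_rates(tp=973, fn=27, fp=47, tn=953)
print(f"Acc={acc:.1%} Ma={ma:.1%} Fa={fa:.1%}")
```

With equal class sizes these definitions give Acc = 1 - (Ma + Fa) / 2, which is consistent with row a1 (96.3%, 2.7%, 4.7%) and supports, without proving, the assumed reading.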
Table 7 Classification results of different feature combinations
Test No. | Feature combination | Acc | Ma | Fa
1 | F1 | 81.1% | 18.7% | 19.0%
2 | F2 | 84.9% | 14.9% | 15.2%
3 | F3 | 82.4% | 17.5% | 17.8%
4 | F4 | 85.3% | 14.3% | 15.0%
5 | F1, F2 | 89.1% | 10.0% | 11.7%
6 | F1, F3 | 88.6% | 10.9% | 11.8%
7 | F1, F4 | 90.5% | 9.0% | 10.0%
8 | F2, F3 | 87.7% | 12.0% | 12.5%
9 | F2, F4 | 92.3% | 7.7% | 7.8%
10 | F3, F4 | 91.5% | 7.8% | 9.1%
11 | F1, F2, F3 | 93.9% | 6.0% | 6.2%
12 | F1, F2, F4 | 94.4% | 5.1% | 6.1%
13 | F1, F3, F4 | 94.3% | 5.3% | 6.0%
14 | F2, F3, F4 | 95.2% | 4.5% | 5.2%
15 | F1, F2, F3, F4 | 96.2% | 3.6% | 4.0%
Table 8 Classification results of different algorithms
Dataset source | Test No. | Classification model | Acc | Ma | Fa
CAIDA | 1 | SVM | 96.5% | 3.3% | 3.6%
CAIDA | 2 | SVM | 96.2% | 3.7% | 4.0%
CAIDA | 3 | SVM | 96.4% | 3.4% | 3.7%
CAIDA | 4 | LR | 96.0% | 3.8% | 4.2%
CAIDA | 5 | LR | 95.3% | 4.7% | 4.6%
CAIDA | 6 | LR | 95.7% | 4.2% | 4.4%
CAIDA | 7 | NBC | 95.5% | 4.3% | 4.6%
CAIDA | 8 | NBC | 94.9% | 5.0% | 5.3%
CAIDA | 9 | NBC | 96.0% | 3.8% | 4.1%
ISP | 10 | SVM | 97.1% | 2.6% | 3.3%
ISP | 11 | SVM | 97.5% | 2.3% | 2.8%
ISP | 12 | SVM | 97.1% | 2.8% | 3.0%
ISP | 13 | LR | 96.8% | 3.2% | 3.2%
ISP | 14 | LR | 96.9% | 2.8% | 3.4%
ISP | 15 | LR | 96.6% | 3.5% | 3.4%
ISP | 16 | NBC | 97.2% | 2.6% | 2.9%
ISP | 17 | NBC | 97.3% | 2.4% | 3.1%
ISP | 18 | NBC | 96.7% | 3.0% | 3.6%
Table 9 Comparison of test results of different methods
Test No. | RadarGun Acc | RadarGun Ma | RadarGun Fa | MIDAR Acc | MIDAR Ma | MIDAR Fa | TreeNET Acc | TreeNET Ma | TreeNET Fa | MLAR Acc | MLAR Ma | MLAR Fa
1 | 84.9% | 17.2% | 13.0% | 92.5% | 7.4% | 7.5% | 90.6% | 9.9% | 9.0% | 96.7% | 2.9% | 3.7%
2 | 80.4% | 18.8% | 20.4% | 91.6% | 8.5% | 8.3% | 89.7% | 9.9% | 10.7% | 95.3% | 4.3% | 5.1%
3 | 76.6% | 23.9% | 22.8% | 90.2% | 9.8% | 9.8% | 90.1% | 10.1% | 9.8% | 96.1% | 3.5% | 4.3%
4 | 84.1% | 15.2% | 16.6% | 89.8% | 9.9% | 10.6% | 91.3% | 8.6% | 8.8% | 95.6% | 4.8% | 3.9%
5 | 87.5% | 13.4% | 11.5% | 93.1% | 7.2% | 6.6% | 91.2% | 8.0% | 9.6% | 95.2% | 4.6% | 5.0%
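One plausible way to relate Table 9 to the accuracy gains quoted in the abstract is to take the relative gain of MLAR's mean accuracy over each baseline's mean accuracy. How the authors actually aggregated the five tests is not stated on this page, so the formula below is an assumption. It yields roughly 15.8%, 4.7% and 5.7%, close to the abstract's 15.8%, 4.8% and 5.7%.

```python
from statistics import mean

# Acc column of Table 9 (percent), five tests per method.
accuracy = {
    "RadarGun": [84.9, 80.4, 76.6, 84.1, 87.5],
    "MIDAR": [92.5, 91.6, 90.2, 89.8, 93.1],
    "TreeNET": [90.6, 89.7, 90.1, 91.3, 91.2],
    "MLAR": [96.7, 95.3, 96.1, 95.6, 95.2],
}

def relative_gain(baseline, ours):
    # Assumed aggregation: ratio of mean accuracies, expressed in percent.
    return (mean(ours) / mean(baseline) - 1.0) * 100.0

gains = {
    method: relative_gain(values, accuracy["MLAR"])
    for method, values in accuracy.items()
    if method != "MLAR"
}
for method, gain in gains.items():
    print(f"{method}: +{gain:.1f}%")
```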
Table 10 Efficiency comparison of different methods
Time taken (h) by each method in three tests as the number of interface IPs increases.
Method | 1×10^6, test 1 | 1×10^6, test 2 | 1×10^6, test 3 | 2×10^6, test 1 | 2×10^6, test 2 | 2×10^6, test 3 | 3×10^6, test 1 | 3×10^6, test 2 | 3×10^6, test 3 | 4×10^6, test 1 | 4×10^6, test 2 | 4×10^6, test 3 | 5×10^6, test 1 | 5×10^6, test 2 | 5×10^6, test 3
RadarGun | 10.4 | 10.2 | 11.7 | 22.1 | 18.9 | 18.6 | 30.1 | 33.5 | 30.0 | 46.9 | 46.0 | 49.3 | 65.7 | 64.9 | 68.7
MIDAR | 7.0 | 8.8 | 6.8 | 13.2 | 15.1 | 13.5 | 20.1 | 23.5 | 20.9 | 27.5 | 26.8 | 30.8 | 41.1 | 44.3 | 38.9
TreeNET | 4.8 | 5.2 | 6.5 | 9.9 | 10.6 | 11.1 | 14.4 | 15.2 | 17.3 | 23.0 | 22.4 | 24.9 | 31.9 | 32.8 | 34.1
MLAR | 3.5 | 3.3 | 3.6 | 5.4 | 5.1 | 4.6 | 8.1 | 7.4 | 6.9 | 11.5 | 9.8 | 10.2 | 15.5 | 16.7 | 14.9
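The "up to" time reductions quoted in the abstract can be recovered from Table 10 if, for each problem size, the three runs are averaged and the largest relative saving across sizes is taken. That aggregation is an assumption on my part, but it reproduces the quoted 77.8%, 65.3% and 55.2% to one decimal place.

```python
from statistics import mean

# Hours per method from Table 10: one inner list of three runs per problem
# size, for 1e6, 2e6, 3e6, 4e6 and 5e6 interface IPs.
hours = {
    "RadarGun": [[10.4, 10.2, 11.7], [22.1, 18.9, 18.6], [30.1, 33.5, 30.0],
                 [46.9, 46.0, 49.3], [65.7, 64.9, 68.7]],
    "MIDAR": [[7.0, 8.8, 6.8], [13.2, 15.1, 13.5], [20.1, 23.5, 20.9],
              [27.5, 26.8, 30.8], [41.1, 44.3, 38.9]],
    "TreeNET": [[4.8, 5.2, 6.5], [9.9, 10.6, 11.1], [14.4, 15.2, 17.3],
                [23.0, 22.4, 24.9], [31.9, 32.8, 34.1]],
    "MLAR": [[3.5, 3.3, 3.6], [5.4, 5.1, 4.6], [8.1, 7.4, 6.9],
             [11.5, 9.8, 10.2], [15.5, 16.7, 14.9]],
}

def max_time_reduction(baseline, ours):
    # Largest saving, in percent, of mean run time across problem sizes.
    return max((1.0 - mean(o) / mean(b)) * 100.0 for b, o in zip(baseline, ours))

reductions = {
    method: max_time_reduction(runs, hours["MLAR"])
    for method, runs in hours.items()
    if method != "MLAR"
}
for method, value in reductions.items():
    print(f"{method}: up to {value:.1f}% less time")
```

Under this calculation the maximum falls at 4×10^6 interface IPs for RadarGun and TreeNET and at 3×10^6 for MIDAR.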
Table 11 Comparison of geolocation test results
Failure rate of each geolocation algorithm on target IPs in two regions, before and after applying each alias resolution method.
Alias resolution method | Beijing (China): SLG | Beijing (China): LENCR | Beijing (China): PoPG | California (USA): SLG | California (USA): LENCR | California (USA): PoPG
None | 28.9% | 31.6% | 18.7% | 25.1% | 26.5% | 15.2%
RadarGun | 21.6% | 24.4% | 13.2% | 19.3% | 16.4% | 11.9%
MIDAR | 13.5% | 16.3% | 9.6% | 15.9% | 13.7% | 9.7%
TreeNET | 16.1% | 19.2% | 11.8% | 16.2% | 14.6% | 10.4%
MLAR | 9.0% | 10.4% | 6.7% | 9.5% | 10.3% | 7.3%
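The failure-rate reductions quoted in the abstract match Table 11 if the relative reduction is computed per region (no alias resolution versus MLAR) and then averaged over the two regions. The averaging step is an assumption, but it reproduces the quoted 65.5%, 64.1% and 58.1% to one decimal place.

```python
# Geolocation failure rates (percent) from Table 11, without alias
# resolution and with MLAR, for the two test regions.
without = {
    "Beijing": {"SLG": 28.9, "LENCR": 31.6, "PoPG": 18.7},
    "California": {"SLG": 25.1, "LENCR": 26.5, "PoPG": 15.2},
}
with_mlar = {
    "Beijing": {"SLG": 9.0, "LENCR": 10.4, "PoPG": 6.7},
    "California": {"SLG": 9.5, "LENCR": 10.3, "PoPG": 7.3},
}

def mean_failure_reduction(method):
    # Relative reduction per region, averaged over regions (assumed aggregation).
    per_region = [
        (1.0 - with_mlar[region][method] / without[region][method]) * 100.0
        for region in without
    ]
    return sum(per_region) / len(per_region)

failure_reduction = {m: mean_failure_reduction(m) for m in ("SLG", "LENCR", "PoPG")}
for method, value in failure_reduction.items():
    print(f"{method}: failure rate down {value:.1f}%")
```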
[1] CANBAZ M A. Internet topology mining: from big data to network science[D]. Reno: University of Nevada, 2018.
[2] KARDES H, GUNES M H, SARAC K. Graph based induction of unresponsive routers in internet topologies[J]. Computer Networks, 2015, 81: 178-200.
[3] COSKUN I E, CANBAZ M A, GUNES M H. Efficient AS network topology measurement based on ingress to subnet reachability[C]// IEEE 41st Conference on Local Computer Networks Workshops. 2016: 87-95.
[4] WANG Y, BURGENER D, FLORES M, et al. Towards street-level client-independent IP geolocation[C]// Symposium on Network System Design and Implementation. 2011: 27-27.
[5] CHEN J, LIU F, SHI Y, et al. Towards IP location estimation using the nearest common router[J]. Journal of Internet Technology, 2018, 19(7): 2097-2110.
[6] YUAN F, LIU F, HUANG D, et al. A high completeness PoP partition algorithm for IP geolocation[J]. IEEE Access, 2019, 7: 28340-28355.
[7] KEYS K. Internet-scale IP alias resolution techniques[J]. ACM Sigcomm Computer Communication Review, 2010, 40(1): 50-55.
[8] MARCHETTA P, PESCAPÉ A. DRAGO: detecting, quantifying and locating hidden routers in traceroute IP paths[C]// Proceedings IEEE International Conference on Computer Communications. 2013: 3237-3242.
[9] LI R, SUN Y, HU J, et al. Street-level landmark evaluation based on nearest routers[J]. Security and Communication Networks, 2018(2): 1-12.
[10] HINGANT J, ZAMBRANO M, PÉREZ F J, et al. HYBINT: a hybrid intelligence system for critical infrastructures protection[J]. Security and Communication Networks, 2018
[11] FANG B X. A hierarchy model on the research fields of cyberspace security technology[J]. Chinese Journal of Network and Information Security, 2015, 1(1): 2-7.
[12] ZHAO F, LUO X Y, LIU F L. Research on cyberspace surveying and mapping technology[J]. Chinese Journal of Network and Information Security, 2016, 2(9): 1-11.
[13] LI Y X, XIE Y J. Analysis and enlightenment on the cybersecurity strategy of various countries in the world[J]. Chinese Journal of Network and Information Security, 2016, 2(1): 1-5.
[14] GUO L, CAO Y, SU M J, et al. Cyberspace resources surveying and mapping: the concepts and technologies[J]. Journal of Cyber Security, 2018, 3(4): 1-14.
[15] WANG S, ZHANG Y, WU Y D. Survey on network topology visualization[J]. Chinese Journal of Network and Information Security, 2018, 4(2): 1-17.
[16] GOVINDAN R, TANGMUNARUNKIT H. Heuristics for internet map discovery[C]// Proceedings IEEE International Conference on Computer Communications. 2000: 1371-1380.
[17] KEYS K. Iffinder, a tool for mapping interfaces to routers[EB].
[18] SPRING N, MAHAJAN R, WETHERALL D. Measuring ISP topologies with rocketfuel[J]. ACM Sigcomm Computer Communication Review, 2002, 32(4): 133-145.
[19] BENDER A, SHERWOOD R, SPRING N. Fixing ally's growing pains with velocity modeling[C]// Proceedings of the 8th ACM Sigcomm Conference on Internet Measurement. 2008: 337-342.
[20] KEYS K, HYUN Y, LUCKIE M, et al. Internet-scale IPv4 alias resolution with MIDAR[J]. IEEE/ACM Transactions on Networking, 2013, 21(2): 383-399.
[21] SHERWOOD R, SPRING N. Touring the internet in a TCP sidecar[C]// Proceedings of the 6th ACM Sigcomm Conference on Internet Measurement. 2006: 339-344.
[22] SHERRY J, KATZ-BASSETT E, PIMENOVA M, et al. Resolving IP aliases with prespecified timestamps[C]// Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement. 2010: 172-178.
[23] MARCHETTA P, PERSICO V, PESCAPÈ A. Pythia: yet another active probing technique for alias resolution[C]// Proceedings of the 9th ACM Conference on Emerging Networking Experiments and Technologies. 2013: 229-234.
[24] GRAILET J F, DONNET B. Towards a renewed alias resolution with space search reduction and IP fingerprinting[C]// Network Traffic Measurement and Analysis Conference. 2017: 1-9.
[25] GUNES M H, SARAC K. Analytical IP alias resolution[C]// IEEE International Conference on Communications. 2006: 459-464.
[26] AUGUSTIN B, CUVELLIER X, ORGOGOZO B, et al. Avoiding traceroute anomalies with paris traceroute[C]// Proceedings of the 6th ACM SIGCOMM Conference on Internet Measurement. 2006: 153-158.
[27] SPRING N, DONTCHEVA M, RODRIG M, et al. How to resolve IP aliases[D]. Seattle: University of Washington, 2004.
[28] ZHAO H, BAI H L, CHEN M, et al. Alias filtering technique in alias resolution[J]. Journal of Software, 2009(8): 2280-2288.
[29] TOZAL M, SARAC K. TraceNET: an internet topology data collector[C]// Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement. 2010: 356-368.
[30] PADMANABHAN V N, SUBRAMANIAN L. An investigation of geographic mapping techniques for internet hosts[J]. ACM SIGCOMM Computer Communication Review, 2001, 31(4): 173-185.
[31] GUEYE B, ZIVIANI A, CROVELLA M, et al. Constraint-based geolocation of internet hosts[J]. IEEE/ACM Transactions on Networking, 2006, 14(6): 1219-1232.
[32] SCHAPIRA M, ZHU Y, REXFORD J. Putting BGP on the right path: a case for next-hop routing[C]// Proceedings of the 9th ACM SIGCOMM Workshop on Hot Topics in Networks. 2010: 3.
[33] LENCSE G, RÉPÁS S. Performance analysis and comparison of different DNS64 implementations for linux, openBSD and freeBSD[C]// IEEE 27th International Conference on Advanced Information Networking and Applications. 2013: 877-884.
[34] ZHAO F, LUO X, GAN Y, et al. IP geolocation based on identification routers and local delay distribution similarity[J]. Concurrency and Computation: Practice and Experience, 2018: 1-15.