Journal on Communications ›› 2022, Vol. 43 ›› Issue (3): 30-41. doi: 10.11959/j.issn.1000-436x.2022043
Bin LI¹, Qinglei ZHOU¹, Xiaojie CHEN², Feng FENG¹
Revised: 2022-02-08
Online: 2022-03-25
Published: 2022-03-01
About the author: Bin LI (1986- ), male, born in Zhengzhou, Henan, Ph.D., lecturer at Zhengzhou University. His main research interests are information security and reconfigurable computing.
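Abstract:
To address the low efficiency of software implementations of the SM2 algorithm and the low resource utilization and poor scalability of hardware implementations, a reconfigurable optimization method for the SM2 algorithm over prime fields is proposed. Based on an in-depth analysis of SM2, and starting from the different computation stages and their characteristics, KOA (Karatsuba-Ofman algorithm) fast multiplication, fast modular reduction, and the Barrett algorithm are used to implement modular multiplication for either the recommended parameters or arbitrary parameters, and a radix-4 extended Euclidean algorithm is optimized to accelerate modular inversion. Then, in standard projective coordinates, the Montgomery method is used to improve the efficiency of point multiplication, and the data flows of point addition and point doubling are optimized so that each operation takes 12 clock cycles. In addition, fast coordinate system conversion is implemented inside the FPGA. Finally, parallel scheduling and management of multiple SM2 units is designed and implemented to meet increasingly diverse application requirements. Experimental results show that the optimized SM2 makes full use of FPGA resources and shortens the point multiplication cycle. Its number of computations per second is up to 352.48 times higher than that of a CPU (Intel i5-8300), improving both computing performance and scalability.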
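The modular multiplication described in the abstract can be illustrated in software. The following is a minimal sketch, not the authors' hardware design: it pairs a recursive Karatsuba-Ofman multiplication with a textbook Barrett reduction, which works for an arbitrary modulus. The function names, the 64-bit base case, and the use of the SM2 recommended prime (taken from the SM2 standard, not from this page) are assumptions made for illustration. The specialized fast reduction for the recommended prime is not shown.

```python
# SM2 recommended prime (assumption: from the SM2 standard, not stated on this page).
P_SM2 = 2**256 - 2**224 - 2**96 + 2**64 - 1

def koa_mul(x, y, bits=256):
    """Karatsuba-Ofman product of two non-negative integers, split at bits // 2."""
    if bits <= 64:
        return x * y  # base case: stands in for a small hardware multiplier
    half = bits // 2
    mask = (1 << half) - 1
    x0, x1 = x & mask, x >> half
    y0, y1 = y & mask, y >> half
    low = koa_mul(x0, y0, half)
    high = koa_mul(x1, y1, half)
    # Three sub-products instead of four: the cross term is recovered by subtraction.
    mid = koa_mul(x0 + x1, y0 + y1, half) - low - high
    return (high << (2 * half)) + (mid << half) + low

def barrett_setup(n):
    """Precompute k = bit length of n and mu = floor(4^k / n)."""
    k = n.bit_length()
    return k, (1 << (2 * k)) // n

def barrett_reduce(x, n, k, mu):
    """Reduce x (0 <= x < n*n) modulo n without a division by n."""
    q = (x * mu) >> (2 * k)   # estimate of x // n, never too large
    r = x - q * n
    while r >= n:             # a few correction steps at most
        r -= n
    return r

def mod_mul(a, b, n=P_SM2):
    """Modular multiplication: KOA product followed by Barrett reduction."""
    k, mu = barrett_setup(n)
    return barrett_reduce(koa_mul(a, b), n, k, mu)

a, b = P_SM2 - 1, P_SM2 - 2
print(mod_mul(a, b) == (a * b) % P_SM2)
```

Because the Barrett constant depends only on the modulus, it can be computed once per parameter set, which is what makes this approach suitable for arbitrary curve parameters.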
Bin LI, Qinglei ZHOU, Xiaojie CHEN, Feng FENG. Optimization of reconfigurable SM2 algorithm over prime field[J]. Journal on Communications, 2022, 43(3): 30-41.
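Table 1
Point addition computation process and data flow
Clock | KOA multiplication | Fast modular reduction | Intermediate variable | Result |
1 | $T_1 = X_1 Z_2$ | - | $T_6 = 2b$ | - |
2 | $T_2 = X_2 Z_1$ | $T_1 \bmod P_{\mathrm{SM2}}$ | $T_6 = 4b$ | - |
3 | $T_3 = Z_1 Z_2$ | $T_2 \bmod P_{\mathrm{SM2}}$ | - | $X_1 Z_2$ |
4 | $T_4 = X_1 X_2$ | $T_3 \bmod P_{\mathrm{SM2}}$ | $T_1 = T_1 - T_2$ | $X_2 Z_1$ |
5 | $T_5 = a T_3$ | $T_4 \bmod P_{\mathrm{SM2}}$ | $T_2 = T_1 + T_2$ | $Z_1 Z_2$, $X_1 Z_2 - X_2 Z_1$ |
6 | $T_6 = T_6 T_3$ | $T_5 \bmod P_{\mathrm{SM2}}$ | - | $X_1 X_2$, $X_1 Z_2 + X_2 Z_1$ |
7 | $T_1 = T_1^2$ | $T_6 \bmod P_{\mathrm{SM2}}$ | $T_4 = T_4 - T_5$ | $a Z_1 Z_2$ |
8 | $T_2 = T_6 T_2$ | $T_1 \bmod P_{\mathrm{SM2}}$ | - | $4b Z_1 Z_2$, $X_1 X_2 - a Z_1 Z_2$ |
9 | $T_4 = T_4^2$ | $T_2 \bmod P_{\mathrm{SM2}}$ | - | $(X_1 Z_2 - X_2 Z_1)^2$ |
10 | $T_1 = x_G T_1$ | $T_4 \bmod P_{\mathrm{SM2}}$ | - | $4b Z_1 Z_2 (X_1 Z_2 + X_2 Z_1)$ |
11 | - | $T_1 \bmod P_{\mathrm{SM2}}$ | $T_4 = T_4 - T_2$ | $(X_1 X_2 - a Z_1 Z_2)^2$ |
12 | - | - | - | $x_G (X_1 Z_2 - X_2 Z_1)^2$, $(X_1 X_2 - a Z_1 Z_2)^2 - 4b Z_1 Z_2 (X_1 Z_2 + X_2 Z_1)$ |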
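The two expressions in the final row of the table above are the projective X and Z coordinates of the sum of two points whose difference has the known affine x-coordinate $x_G$. The sketch below evaluates those expressions directly and checks them against textbook affine addition. It is an illustration under stated assumptions: the toy curve $y^2 = x^3 + 2x + 3$ over $F_{97}$, the point $G = (3, 6)$, and all function names are hypothetical and chosen only to keep the numbers small. It does not reproduce the 12-cycle register schedule.

```python
# Toy curve for illustration only (assumption): y^2 = x^3 + 2x + 3 over F_97.
P_MOD, A, B = 97, 2, 3

def diff_add(X1, Z1, X2, Z2, x_g):
    """x-only addition of (X1:Z1) and (X2:Z2); their difference has affine x = x_g."""
    x1z2 = X1 * Z2 % P_MOD
    x2z1 = X2 * Z1 % P_MOD
    z1z2 = Z1 * Z2 % P_MOD
    x1x2 = X1 * X2 % P_MOD
    # X3 = (X1X2 - aZ1Z2)^2 - 4bZ1Z2(X1Z2 + X2Z1)
    X3 = ((x1x2 - A * z1z2) ** 2 - 4 * B * z1z2 * (x1z2 + x2z1)) % P_MOD
    # Z3 = xG (X1Z2 - X2Z1)^2
    Z3 = x_g * (x1z2 - x2z1) ** 2 % P_MOD
    return X3, Z3

def affine_x(X, Z):
    """Convert a projective x-coordinate back to affine form (requires Z != 0)."""
    return X * pow(Z, -1, P_MOD) % P_MOD

def affine_add(p1, p2):
    """Textbook affine addition, used only as a reference; assumes p1 != -p2."""
    (x1, y1), (x2, y2) = p1, p2
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return x3, (lam * (x1 - x3) - y1) % P_MOD

G = (3, 6)                      # on the toy curve: 6^2 = 36 = 27 + 6 + 3
G2 = affine_add(G, G)           # reference value of 2G
G3 = affine_add(G2, G)          # reference value of 3G
X3, Z3 = diff_add(G[0], 1, G2[0], 1, G[0])   # G + 2G, difference G
print(affine_x(X3, Z3) == G3[0])
```

Only x-coordinates enter the formula, which is why the Montgomery method can defer recovering y until after the ladder finishes.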
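Table 2
Point doubling computation process and data flow
Clock | KOA multiplication | Fast modular reduction | Intermediate variable | Result |
1 | $T_1 = Z_1^2$ | - | $T_4 = 2b$ | - |
2 | $T_2 = X_1^2$ | $T_1 \bmod P_{\mathrm{SM2}}$ | $T_4 = 4b$ | - |
3 | $T_3 = a T_1$ | $T_2 \bmod P_{\mathrm{SM2}}$ | - | $Z_1^2$ |
4 | $T_4 = T_4 T_1$ | $T_3 \bmod P_{\mathrm{SM2}}$ | - | $X_1^2$ |
5 | $T_5 = X_1 Z_1$ | $T_4 \bmod P_{\mathrm{SM2}}$ | $T_2 = T_2 - T_3$ | $a Z_1^2$ |
6 | $T_1 = T_1 T_4$ | $T_5 \bmod P_{\mathrm{SM2}}$ | $T_3 = T_2 + T_3$ | $4b Z_1^2$, $X_1^2 - a Z_1^2$ |
7 | $T_4 = T_5 T_4$ | $T_1 \bmod P_{\mathrm{SM2}}$ | $T_3 = 2 T_3$ | $X_1 Z_1$, $X_1^2 + a Z_1^2$ |
8 | $T_2 = T_2^2$ | $T_4 \bmod P_{\mathrm{SM2}}$ | $T_3 = 2 T_3$ | $4b Z_1^4$, $2(X_1^2 + a Z_1^2)$ |
9 | $T_3 = T_5 T_3$ | $T_2 \bmod P_{\mathrm{SM2}}$ | $T_4 = 2 T_4$ | $4b X_1 Z_1^3$, $4(X_1^2 + a Z_1^2)$ |
10 | - | $T_3 \bmod P_{\mathrm{SM2}}$ | $T_2 = T_2 - T_4$ | $(X_1^2 - a Z_1^2)^2$, $8b X_1 Z_1^3$ |
11 | - | - | $T_1 = T_1 + T_3$ | $4 X_1 Z_1 (X_1^2 + a Z_1^2)$, $(X_1^2 - a Z_1^2)^2 - 8b X_1 Z_1^3$ |
12 | - | - | - | $4 X_1 Z_1 (X_1^2 + a Z_1^2) + 4b Z_1^4$ |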
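The last two rows of the doubling table above give the projective coordinates of the doubled point: $X = (X_1^2 - aZ_1^2)^2 - 8bX_1Z_1^3$ and $Z = 4X_1Z_1(X_1^2 + aZ_1^2) + 4bZ_1^4$. A minimal sketch that evaluates these two expressions follows. It assumes a toy curve $y^2 = x^3 + 2x + 3$ over $F_{97}$ with the point $G = (3, 6)$, for which affine arithmetic gives $x(2G) = 80$. The curve, the point, and the function names are illustrative assumptions, and the code mirrors the arithmetic only, not the clock-by-clock schedule.

```python
# Toy curve for illustration only (assumption): y^2 = x^3 + 2x + 3 over F_97.
P_MOD, A, B = 97, 2, 3

def x_double(X1, Z1):
    """x-only doubling of the projective point (X1:Z1)."""
    xx = X1 * X1 % P_MOD               # X1^2
    zz = Z1 * Z1 % P_MOD               # Z1^2
    xz = X1 * Z1 % P_MOD               # X1 Z1
    a_zz = A * zz % P_MOD              # a Z1^2
    four_b_zz = 4 * B * zz % P_MOD     # 4b Z1^2
    # X = (X1^2 - aZ1^2)^2 - 8b X1 Z1^3
    X = ((xx - a_zz) ** 2 - 2 * xz * four_b_zz) % P_MOD
    # Z = 4 X1 Z1 (X1^2 + aZ1^2) + 4b Z1^4
    Z = (4 * xz * (xx + a_zz) + zz * four_b_zz) % P_MOD
    return X, Z

def affine_x(X, Z):
    """Convert a projective x-coordinate back to affine form (requires Z != 0)."""
    return X * pow(Z, -1, P_MOD) % P_MOD

X2, Z2 = x_double(3, 1)        # double G = (3, 6), using only its x-coordinate
print(affine_x(X2, Z2))        # x-coordinate of 2G on the toy curve
```

Both outputs are homogeneous of degree four in $(X_1, Z_1)$, so any projective representative of the input yields the same affine result.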
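Table 6
Point multiplication performance of the proposed scheme compared with other hardware schemes
Scheme | Device | Parameter type | Clock frequency/MHz | Point multiplication performance/(operations·s⁻¹) | Clock cycles per operation |
Ref. [ | Virtex-7 | Arbitrary | 225.0 | 671 | 335 360 |
Ref. [ | ASIC 65 nm | Arbitrary | 546.5 | 1 375 | 397 300 |
Ref. [ | Virtex-7 | Arbitrary | 90.7 | 1 378 | 65 783 |
Ref. [ | Virtex-7 | Arbitrary | 177.7 | 676 | 262 650 |
Ref. [ | ASIC 0.13 μm | Recommended | 163.7 | 49 105 | 3 333 |
Ref. [ | ASIC 0.13 μm | Recommended | 214.0 | 45 147 | 4 740 |
Ref. [ | Virtex-6 | Recommended | 166.0 | 10 807 | 15 360 |
Ref. [ | XC7Z020 | Recommended | 268.1 | 2 173 | 123 187 |
Proposed scheme | Xcku-060 | Arbitrary | 27.0 | 998 | 27 028 |
Proposed scheme | Xcku-060 | Recommended | 27.0 | 8 812 | 3 064 |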
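The performance and cycle-count columns above are linked by the clock frequency: throughput is roughly the clock rate divided by the cycles per point multiplication. The short sketch below checks this for the proposed scheme, reading its figures as 27.0 MHz with 3 064 cycles (recommended parameters) and 27 028 cycles (arbitrary parameters). The function name and the choice to round down are assumptions made for illustration.

```python
def point_mults_per_second(freq_mhz, cycles):
    """Whole point multiplications completed per second at the given clock rate."""
    return round(freq_mhz * 1_000_000) // cycles

print(point_mults_per_second(27.0, 3064))    # recommended parameters
print(point_mults_per_second(27.0, 27028))   # arbitrary parameters
```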
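Table 7
Security and AT value of the proposed scheme compared with other FPGA schemes
Scheme | Device | Parameter type | SPA resistant | Area/kslice | Time/ms | AT |
Ref. [ | Virtex-7 | Arbitrary | No | 1.70 | 1.49 | 2.53 |
Ref. [ | Virtex-7 | Arbitrary | No | 19.33 | 0.73 | 14.11 |
Ref. [ | Virtex-7 | Arbitrary | Yes | 8.90 | 1.48 | 13.17 |
Ref. [ | Virtex-6 | Recommended | Yes | 6.78 | 0.09 | 0.61 |
Ref. [ | XC7Z020 | Recommended | Yes | 2.02 | 0.46 | 0.93 |
Proposed scheme | Xcku-060 | Arbitrary | Yes | 4.68 | 1.00 | 4.68 |
Proposed scheme | Xcku-060 | Recommended | Yes | 4.80 | 0.11 | 0.53 |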
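The AT column above is the product of the area and time columns, so a smaller value means a better area-time trade-off. A minimal check, with a hypothetical helper name and two-decimal rounding assumed to match the table:

```python
def area_time(area_kslice, time_ms):
    """AT product: occupied area (kslice) times point multiplication time (ms)."""
    return round(area_kslice * time_ms, 2)

# Proposed scheme, as (area/kslice, time/ms) pairs taken from the table.
rows = {"arbitrary": (4.68, 1.00), "recommended": (4.80, 0.11)}
for name, (area, time_ms) in rows.items():
    print(name, area_time(area, time_ms))
```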
Table 8
Overall comparison of the proposed scheme with other FPGA implementations of the SM2 algorithm
Scheme | Device | SPA resistant | Clock frequency/MHz | Performance/(operations·s⁻¹) | Area/kslice | Time/ms | AT | Performance gain of the proposed scheme (×) | AT gain of the proposed scheme (×) |
Ref. [ | StratixII | Yes | 62.3 | 1 298 | 4.74 | 0.77 | 3.65 | 6.79 | 6.91 |
Ref. [ | xc6vlx760 | No | 38.0 | 2 703 | 6.91 | 0.37 | 2.56 | 3.26 | 4.84 |
Ref. [ | XC7V585-3 | Yes | 244.0 | 1 645 | 5.35 | 0.61 | 3.26 | 5.36 | 6.18 |
Ref. [ | ZYNQ ZC706 | Yes | 110.0 | 444 | 7.15 | 2.25 | 16.09 | 19.85 | 30.47 |
Ref. [ | xc6vlx760 | No | 38.4 | 2 703 | 10.06 | 0.37 | 3.72 | 3.26 | 7.05 |
Proposed scheme | Xcku-060 | Yes | 27.0 | 8 812 | 4.80 | 0.11 | 0.53 | - | - |
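The two gain columns above appear to be simple ratios against the proposed scheme's row: the performance gain divides 8 812 by the reference throughput, and the AT gain divides the reference area-time product by the proposed scheme's unrounded product 4.80 × 0.11. The sketch below reproduces the listed values under that reading. The helper names and the rounding are assumptions, not taken from the paper.

```python
# Proposed scheme row: operations per second, area in kslice, time in ms.
OURS_OPS, OURS_AREA, OURS_TIME = 8812, 4.80, 0.11

def performance_gain(ref_ops):
    """How many times faster the proposed scheme is than a reference design."""
    return round(OURS_OPS / ref_ops, 2)

def at_gain(ref_area, ref_time):
    """Ratio of the reference area-time product to that of the proposed scheme."""
    return round((ref_area * ref_time) / (OURS_AREA * OURS_TIME), 2)

print(performance_gain(1298), at_gain(4.74, 0.77))   # first row of the table
```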
[1] JAVEED K, WANG X J. Radix-4 and radix-8 Booth encoded interleaved modular multipliers over general Fp[C]// Proceedings of 2014 24th International Conference on Field Programmable Logic and Applications (FPL). Piscataway: IEEE Press, 2014: 1-6.
[2] AMIET D, CURIGER A, ZBINDEN P. Flexible FPGA-based architectures for curve point multiplication over GF(p)[C]// Proceedings of 2016 Euromicro Conference on Digital System Design (DSD). Piscataway: IEEE Press, 2016: 107-114.
[3] FENG X, LI S G. A high-speed and SPA-resistant implementation of ECC point multiplication over GF(p)[C]// Proceedings of 2017 IEEE Trustcom/BigDataSE/ICESS. Piscataway: IEEE Press, 2017: 255-260.
[4] ZHAO Z W, BAI G Q. Ultra high-speed SM2 ASIC implementation[C]// Proceedings of 2014 IEEE 13th International Conference on Trust, Security and Privacy in Computing and Communications. Piscataway: IEEE Press, 2014: 182-188.
[5] ZHANG D, BAI G Q. Ultra high-performance ASIC implementation of SM2 with power-analysis resistance[C]// Proceedings of 2015 IEEE International Conference on Electron Devices and Solid-State Circuits. Piscataway: IEEE Press, 2015: 523-526.
[6] HOSSAIN M S, KONG Y N, SAEEDI E, et al. High-performance elliptic curve cryptography processor over NIST prime fields[J]. IET Computers & Digital Techniques, 2017, 11(1): 33-42.
[7] HAN X W, WU L J, WANG B B, et al. Atomic algorithm against simple power attack of SM2[J]. Journal of Computer Research and Development, 2016, 53(8): 1850-1856.
[8] JAVEED K, WANG X J. Low latency flexible FPGA implementation of point multiplication on elliptic curves over GF(p)[J]. International Journal of Circuit Theory and Applications, 2017, 45(2): 214-228.
[9] RAHMAN M S, HOSSAIN M S, RAHAT E H, et al. Efficient hardware implementation of 256-bit ECC processor over prime field[C]// Proceedings of 2019 International Conference on Electrical, Computer and Communication Engineering (ECCE). Piscataway: IEEE Press, 2019: 1-6.
[10] JÄRVINEN K, MIELE A, AZARDERAKHSH R, et al. FourQ on FPGA: new hardware speed records for elliptic curve cryptography over large prime characteristic fields[C]// International Conference on Cryptographic Hardware and Embedded Systems. Berlin: Springer, 2016: 517-537.
[11] LI W, LIU J H, BAI G Q. High-speed implementation of SM2 based on fast modulus inverse algorithm[C]// Proceedings of 2018 China Semiconductor Technology International Conference (CSTIC). Piscataway: IEEE Press, 2018: 1-3.
[12] DING J N, LI S G. A reconfigurable high-speed ECC processor over NIST primes[C]// Proceedings of 2017 IEEE Trustcom/BigDataSE/ICESS. Piscataway: IEEE Press, 2017: 1064-1069.
[13] YANG D Y, DAI Z B, LI W, et al. An efficient ASIC implementation of public key cryptography algorithm SM2 based on module arithmetic logic unit[C]// Proceedings of 2019 IEEE 13th International Conference on ASIC. Piscataway: IEEE Press, 2019: 1-4.
[14] HOSSAIN M R, HOSSAIN M S. Efficient FPGA implementation of modular arithmetic for elliptic curve cryptography[C]// Proceedings of 2019 International Conference on Electrical, Computer and Communication Engineering (ECCE). Piscataway: IEEE Press, 2019: 1-6.
[15] DING J N, LI S G, GU Z. High-speed ECC processor over NIST prime fields applied with Toom-Cook multiplication[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2019, 66(3): 1003-1016.
[16] ROY D B, MUKHOPADHYAY D. High-speed implementation of ECC scalar multiplication in GF(p) for generic Montgomery curves[J]. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2019, 27(7): 1587-1600.
[17] ISLAM M M, HOSSAIN M S, HASAN M K, et al. FPGA implementation of high-speed area-efficient processor for elliptic curve point multiplication over prime field[J]. IEEE Access, 2019, 7: 178811-178826.
[18] KHAN S, JAVEED K, SHAH Y A. High-speed FPGA implementation of full-word Montgomery multiplier for ECC applications[J]. Microprocessors and Microsystems, 2018, 62: 91-101.
[19] ZHANG D, BAI G Q. High-performance implementation of SM2 based on FPGA[C]// Proceedings of 2016 8th IEEE International Conference on Communication Software and Networks. Piscataway: IEEE Press, 2016: 718-722.
[20] GARG H K, XIAO H S. New residue arithmetic based Barrett algorithms: modular integer computations[J]. IEEE Access, 2016, 4: 4882-4890.
[21] BRIER E, JOYE M. Weierstraß elliptic curves and side-channel attacks[C]// International Workshop on Public Key Cryptography. Berlin: Springer, 2002: 335-345.
[22] JAVEED K, WANG X J. FPGA based high speed SPA resistant elliptic curve scalar multiplier architecture[J]. International Journal of Reconfigurable Computing, 2016, 2016: 6371403.
[23] YU W, WANG K P, LI B, et al. Montgomery algorithm over a prime field[J]. Chinese Journal of Electronics, 2019, 28(1): 39-44.
[24] HU X H, ZHENG X, ZHANG S S, et al. A high-performance elliptic curve cryptographic processor of SM2 over GF(p)[J]. Electronics, 2019, 8(4): 431.
[25] WU T, YE J H, LU J. Hardware implementation of SM2 ECC protocols on FPGAs[C]// Proceedings of 2021 IEEE 5th Information Technology, Networking, Electronic and Automation Control Conference. Piscataway: IEEE Press, 2021: 33-37.
[26] WANG T F, ZHANG H F, XU S. Design and implementation of SM2 co-processor with specific instructions[J]. Computer Engineering and Applications, 2022, 58(2): 102-109.
[27] XIAO Y, LIN W B, ZHAO Y, et al. A high-speed elliptic curve cryptography processor for teleoperated systems security[J]. Mathematical Problems in Engineering, 2021, 2021: 6633925.
[28] YANG G Q, DING H C, ZOU J, et al. A big data security scheme based on high-performance cryptography implementation[J]. Journal of Computer Research and Development, 2019, 56(10): 2207-2215.