通信学报 ›› 2022, Vol. 43 ›› Issue (3): 30-41.doi: 10.11959/j.issn.1000-436x.2022043

• 学术论文 • 上一篇    下一篇

可重构的素域SM2算法优化方法

李斌1, 周清雷1, 陈晓杰2, 冯峰1   

  1. 1 郑州大学计算机与人工智能学院,河南 郑州 450001
    2 数学工程与先进计算国家重点实验室,河南 郑州 450001
  • 修回日期:2022-02-08 出版日期:2022-03-25 发布日期:2022-03-01
  • 作者简介:李斌(1986- ),男,河南郑州人,博士,郑州大学讲师,主要研究方向为信息安全、可重构计算
    周清雷(1962- ),男,河南新乡人,博士,郑州大学教授,主要研究方向为信息安全、自动机理论和计算复杂性理论
    陈晓杰(1993- ),男,河南武陟人,数学工程与先进计算国家重点实验室博士生,主要研究方向为信息安全、可重构计算
    冯峰(1990- ),男,河南新乡人,郑州大学博士生,主要研究方向为信息安全
  • 基金资助:
    国家重点研发计划基金资助项目(2016YFB0800100);国家重点研发计划基金资助项目(2016YFB0800101);国家自然科学基金资助项目(61572444)

Optimization of reconfigurable SM2 algorithm over prime filed

Bin LI1, Qinglei ZHOU1, Xiaojie CHEN2, Feng FENG1   

  1. 1 School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
    2 State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China
  • Revised:2022-02-08 Online:2022-03-25 Published:2022-03-01
  • Supported by:
    The National Key Research and Development Program of China(2016YFB0800100);The National Key Research and Development Program of China(2016YFB0800101);The National Natural Science Foundation of China(61572444)

摘要:

针对SM2算法软件效率低、硬件实现资源利用率低、可扩展性差的问题,提出了一种可重构的素域SM2算法优化方法。通过对SM2算法的深入分析,从不同计算阶段和计算特点着手,分别采用KOA快速乘法、快速模约减和Barrett算法实现推荐或任意参数的模乘运算,并优化改进基为4的扩展欧几里得算法加速模逆运算。然后,在标准射影坐标系下以蒙哥马利方法提高点乘运算效率,并优化了点加和倍点数据流,将运算周期缩短至12个时钟。同时,在FPGA内部实现了快速的坐标系转换。最后,设计实现了多SM2的并行调度管理,满足日益多样化的应用需求。实验结果分析表明,所优化的SM2充分利用了FPGA的资源,缩短了点乘周期,每秒计算次数最多较CPU(Intel i5-8300)高352.48倍,提高了计算性能和可扩展性。

关键词: 可重构, SM2, FPGA, 蒙哥马利点乘, 快速模乘

Abstract:

Aiming at the problems of inefficient of software, low utilization of hardware resources and poor scalability of SM2 algorithm, a reconfigurable optimization method of SM2 algorithm over prime filed was proposed.Through in-depth analysis of the SM2 algorithm, starting from different computation stages and characteristics, respectively using KOA fast multiplication, fast modular reduction and Barrett algorithm to achieve recommended or arbitrary parameters of the modular multiplication operation, and the radix-4 extended Euclidean algorithm was optimized and improved to accelerate the modular inverse operation.Then, in the standard projective coordinate system, the Montgomery method was used to improve the efficiency of point multiplication, and the data flow of point addition and double point was optimized to shorten the operation cycle to 12 clocks.At the same time, fast coordinate system conversion was realized inside the FPGA.Finally, the parallel scheduling management of multi-SM2 was designed and implemented to meet the computational requirements of multiple applications.The experimental results show that the optimized SM2 makes full use of FPGA resources and shortens the cycle of point multiplication.The maximum number of calculations per second is 352.48 times higher than the CPU (Intel i5-8300), which improves the performance and scalability.

Key words: reconfigurable, SM2, FPGA, Montgomery point multiplication, fast modular multiplication

中图分类号: 

No Suggested Reading articles found!