Journal on Communications ›› 2024, Vol. 45 ›› Issue (2): 213-224. doi: 10.11959/j.issn.1000-436x.2024033

• Correspondences •

Digital watermarking method based on context word prediction and window compression coding

Lingyun XIANG1,2, Minghao HUANG1, Chenling ZHANG1, Chunfang YANG3

  1 School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China
  2 Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha University of Science and Technology, Changsha 410114, China
  3 Henan Key Laboratory of Cyberspace Situation Awareness, Information Engineering University, Zhengzhou 450001, China
  • Revised: 2023-11-19 Online: 2024-02-01 Published: 2024-02-01
  • About the authors:
    Lingyun XIANG (1983− ), female, born in Shuangfeng, Hunan; Ph.D.; professor and master's supervisor at Changsha University of Science and Technology. Her main research interests include information security, information hiding, digital watermarking, steganalysis, and natural language processing.
    Minghao HUANG (1999− ), male, born in Shaoyang, Hunan; master's student at Changsha University of Science and Technology. His main research interests include natural language digital watermarking and natural language processing.
    Chenling ZHANG (2000− ), male, born in Shaoyang, Hunan; master's student at Changsha University of Science and Technology. His main research interests include natural language processing.
    Chunfang YANG (1983− ), male, born in Putian, Fujian; Ph.D.; associate professor and doctoral supervisor at Information Engineering University. His main research interests include information hiding, intelligent multimedia understanding, and network security.
  • Supported by:
    The National Natural Science Foundation of China (61972057); The National Natural Science Foundation of China (61872448); The Natural Science Foundation of Hunan Province (2022JJ30623)

Abstract:

To address the limited number of substitutable words and the low watermark extraction efficiency of existing natural language digital watermarking methods, a digital watermarking method based on context word prediction and window compression coding was proposed. First, a neural network language model automatically learned the contextual semantic features of each word in the original text and predicted a candidate word list for each word, which expanded the number of substitutable words available for carrying watermark information. Meanwhile, because substituting candidate words at different positions affects sentence semantics differently, watermark information was embedded in units of windows, each consisting of several words, and the choice of candidate words during embedding was optimized using the similarity between the sentences before and after word substitution. On this basis, a semantic-independent window compression coding method was proposed, which encoded each window as watermark information according to the character information of the words in the window. This eliminated the dependence on the original context at the substitution positions during watermark extraction. Experimental results show that the proposed method greatly improves watermark extraction efficiency while maintaining high embedding capacity and text quality.

Key words: digital watermarking, word substitution, word prediction, watermark coding
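The window-based embedding and context-free extraction described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the compression coding, so a SHA-256 hash of the window's characters stands in for it; the candidate lists are supplied by hand instead of coming from a neural language model; and `difflib.SequenceMatcher` stands in for the sentence similarity measure. All function names (`window_bits`, `embed_window`, `extract`) are hypothetical.

```python
import hashlib
from difflib import SequenceMatcher


def window_bits(window, n_bits=2):
    """Code a window from its characters only (assumed stand-in coding).

    No language model or original context is needed, which is the
    property that makes extraction cheap.
    """
    digest = hashlib.sha256(" ".join(window).encode("utf-8")).digest()
    return digest[0] % (1 << n_bits)


def similarity(a, b):
    """Character-level similarity; a stand-in for a semantic measure."""
    return SequenceMatcher(None, a, b).ratio()


def embed_window(window, candidates, target, n_bits=2):
    """Return a window whose code equals `target`, or None if impossible.

    Tries every single-word substitution from `candidates` (hand-made
    here; the paper predicts them with a language model) and keeps the
    variant most similar to the original window.
    """
    if window_bits(window, n_bits) == target:
        return list(window)  # already carries the desired bits
    original = " ".join(window)
    best, best_score = None, -1.0
    for pos, word in enumerate(window):
        for cand in candidates.get(word, []):
            variant = window[:pos] + [cand] + window[pos + 1:]
            if window_bits(variant, n_bits) != target:
                continue
            score = similarity(original, " ".join(variant))
            if score > best_score:
                best, best_score = variant, score
    return best


def extract(words, window_size, n_bits=2):
    """Recover one code per non-overlapping window of the marked text."""
    return [
        window_bits(words[i:i + window_size], n_bits)
        for i in range(0, len(words) - window_size + 1, window_size)
    ]


if __name__ == "__main__":
    window = ["the", "quick", "fox", "runs"]
    candidates = {"quick": ["fast", "swift", "rapid"],
                  "runs": ["sprints", "races", "dashes"]}
    for target in range(4):
        marked = embed_window(window, candidates, target)
        print(target, marked, None if marked is None else extract(marked, 4))
```

Because the code of a window depends only on the characters it contains, `extract` never consults the original text or a language model. In this sketch a window for which no substitution yields the target code returns `None`; how the actual method handles such windows is not stated in the abstract.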
