Digital watermarking method based on context word prediction and window compression coding

doi:10.11959/j.issn.1000-436x.2024033

Abstract

To address the limited number of substitutable words and the low watermark extraction efficiency of existing natural language digital watermarking methods, a method based on context word prediction and window compression coding was proposed. Firstly, the contextual semantic features of each word in the original text were learned automatically by a neural network language model, and a candidate word set was then predicted for each word, which expanded the number of substitutable words available for carrying watermark information. Meanwhile, because substituting candidate words at different positions affects the semantics to different degrees, the watermark information was embedded into windows of several words each, and the choice of candidate words for embedding was optimized by the similarity between the sentences before and after substitution. Finally, a semantic-independent window compression coding method was proposed, which encoded each window as the designated watermark information according to the character information of the words in the window, so that watermark extraction no longer depended on the original context at the substituted positions. The experimental results show that the proposed method greatly improves the watermark extraction efficiency while maintaining high embedding capacity and text quality.
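The window idea can be illustrated with a minimal sketch. It is not the authors' coding scheme: the hash-based `window_code`, the helper `embed_window`, and the toy candidate lists are all hypothetical, and "fewest substitutions" stands in for the sentence-similarity criterion described above. The sketch only shows why a code computed from the characters of the words in a window lets extraction run without the original context or a language model.

```python
import hashlib
from itertools import product


def window_code(words, d):
    """Hypothetical coding: map a window to d bits (d <= 8) from its characters alone."""
    digest = hashlib.sha256(" ".join(words).encode("utf-8")).digest()
    return digest[0] & ((1 << d) - 1)


def embed_window(words, candidates, bits, d):
    """Return the variant of the window with the fewest substitutions whose
    code equals `bits`, or None if no combination of candidates matches.

    candidates[i] lists the allowed words at position i, original included.
    """
    best = None
    for combo in product(*candidates):
        if window_code(combo, d) != bits:
            continue
        # Stand-in for the similarity criterion: count changed positions.
        changes = sum(new != old for new, old in zip(combo, words))
        if best is None or changes < best[0]:
            best = (changes, list(combo))
    return None if best is None else best[1]


window = ["the", "quick", "brown", "fox"]  # one window with k = 4 (toy data)
candidates = [["the"], ["quick", "fast", "swift"], ["brown", "dark"], ["fox"]]

for bits in (0, 1):  # d = 1 bit per window
    marked = embed_window(window, candidates, bits, d=1)
    if marked is not None:
        # Extraction needs only the watermarked words themselves.
        assert window_code(marked, 1) == bits
    print(bits, marked)
```

When no combination of candidates produces the required bits, `embed_window` returns `None`, which corresponds to a failed embedding for that window.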

Key words: digital watermarking, word substitution, word prediction, watermarking coding

CLC Number: TP309

Lingyun XIANG, Minghao HUANG, Chenling ZHANG, Chunfang YANG. Digital watermarking method based on context word prediction and window compression coding[J]. Journal on Communications, 2024, 45(2): 213-224.

Figures/Tables 7

k	PPL (d=1 bit)	PPL (d=2 bit)	SS (d=1 bit)	SS (d=2 bit)
4	65	80	92.09%	87.98%
5	64	79	92.99%	88.91%
6	62	72	93.88%	90.43%

α	PPL (k=4)	PPL (k=5)	PPL (k=6)	SS (k=4)	SS (k=5)	SS (k=6)
0.010	70	68	65	91.77%	92.53%	93.28%
0.015	67	66	63	92.05%	92.96%	93.33%
0.020	65	64	62	92.09%	92.99%	93.88%

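Mode	k	PPL (d=1 bit)	SS (d=1 bit)	Embedding success rate (d=1 bit)	PPL (d=2 bit)	SS (d=2 bit)	Embedding success rate (d=2 bit)
Fixed candidate word threshold α=0.02	4	65	92.09%	99.40%	80	87.98%	94.74%
Fixed candidate word threshold α=0.02	5	64	92.99%	99.82%	79	88.91%	97.34%
Fixed candidate word threshold α=0.02	6	62	93.88%	99.97%	72	90.43%	98.57%
Adaptive candidate word threshold	4	64	89.98%	100%	75	85.01%	100%
Adaptive candidate word threshold	5	61	90.82%	100%	74	86.06%	100%
Adaptive candidate word threshold	6	58	91.80%	100%	70	87.31%	100%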

Method	Total watermark capacity	BPW	PPL
Method of Ref. [14]	5370	0.0306	45
Method of Ref. [16]	26298	0.1501	70
Method of Ref. [17]	7554	0.0431	47
Proposed method (k=6, d=1 bit)	22748	0.1299	62
Proposed method (k=6, d=2 bit)	45496	0.2598	72
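The page does not define BPW. Assuming it denotes bits per word, that is, total capacity in bits divided by the number of words in the test text, the rows above imply a test corpus of roughly 175 000 words for every method. The quick check below is only a sanity calculation on the tabulated values; the unit of capacity (bits) and the reading of BPW are assumptions.

```python
# (capacity, BPW) pairs copied from the table above.
# Assumption: capacity is in bits and BPW = capacity / number of words.
rows = {
    "Ref. [14]": (5370, 0.0306),
    "Ref. [16]": (26298, 0.1501),
    "Ref. [17]": (7554, 0.0431),
    "Proposed (k=6, d=1 bit)": (22748, 0.1299),
    "Proposed (k=6, d=2 bit)": (45496, 0.2598),
}

implied_words = {name: cap / bpw for name, (cap, bpw) in rows.items()}

for name, n in implied_words.items():
    print(f"{name}: about {n:,.0f} words")
```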

Method	Extraction efficiency/(bit·s^-1)	Total time/s
Method of Ref. [14]	25.57	210
Method of Ref. [16]	232.7	113
Method of Ref. [17]	0.27	27903
Proposed method (k=6, d=1 bit)	947.83	24
Proposed method (k=6, d=2 bit)	1895.66	24
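The extraction efficiencies above appear to equal the total watermark capacity divided by the total extraction time. The sketch below recomputes them under that assumption, taking the capacities (assumed to be in bits) from the capacity table; small differences in the last digit are expected because the tabulated times are rounded to whole seconds.

```python
# (capacity in bits, total time in s, reported efficiency in bit/s),
# copied from the capacity table and the table above.
rows = {
    "Ref. [14]": (5370, 210, 25.57),
    "Ref. [16]": (26298, 113, 232.7),
    "Ref. [17]": (7554, 27903, 0.27),
    "Proposed (k=6, d=1 bit)": (22748, 24, 947.83),
    "Proposed (k=6, d=2 bit)": (45496, 24, 1895.66),
}

# Assumed relation: efficiency = capacity / time.
computed = {name: cap / t for name, (cap, t, _) in rows.items()}

for name, (_, _, reported) in rows.items():
    print(f"{name}: computed {computed[name]:.2f} bit/s, reported {reported} bit/s")

# Speed-up of the proposed method (d=1 bit) over the fastest compared method.
speedup = computed["Proposed (k=6, d=1 bit)"] / computed["Ref. [16]"]
print(f"speed-up over Ref. [16]: {speedup:.1f}x")
```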

References 32

[1]	THONNARD O, BILGE L, KASHYAP A, et al. Are you at risk? Profiling organizations and individuals subject to targeted attacks[C]// Proceedings of International Conference on Financial Cryptography and Data Security. Berlin: Springer, 2015: 13-31.
[2]	WAN W B, WANG J, ZHANG Y M, et al. A comprehensive survey on robust image watermarking[J]. Neurocomputing, 2022, 488: 226-247.
[3]	LUO X Y, LI Y X, CHANG H W, et al. DVMark: a deep multiscale framework for video watermarking[J]. IEEE Transactions on Image Processing, 2023, PP(99): 1.
[4]	YAMNI M, KARMOUNI H, SAYYOURI M, et al. Efficient watermarking algorithm for digital audio/speech signal[J]. Digital Signal Processing, 2022, 120: 103251.
[5]	HE L, GUI X L, TIAN F, et al. Analyzing and evaluating the robustness of natural language watermarking[J]. Chinese Journal of Computers, 2012, 35(9): 1971-1982.
[6]	XIAO C, ZHANG C, ZHENG C X. FontCode: embedding information in text documents using glyph perturbation[J]. ACM Transactions on Graphics, 2018, 37(2): 1-16.
[7]	QI W F, GUO W, ZHANG T, et al. Robust authentication for paper-based text documents based on text watermarking technology[J]. Mathematical Biosciences and Engineering, 2019, 16(4): 2233-2249.
[8]	YANG X, ZHANG W M, FANG H, et al. Language universal font watermarking with multiple cross-media robustness[J]. Signal Processing, 2023, 203: 108791.
[9]	NOZAKI J, MURAWAKI Y. Addressing segmentation ambiguity in neural linguistic steganography[J]. arXiv Preprint, arXiv:2211.06662, 2022.
[10]	VAROL A M. LZW-CIE: a high-capacity linguistic steganography based on LZW char index encoding[J]. Neural Computing and Applications, 2022, 34(21): 19117-19145.
[11]	MERAL H M, SANKUR B, ÖZSOY A S, et al. Natural language watermarking via morphosyntactic alterations[J]. Computer Speech & Language, 2009, 23(1): 107-125.
[12]	WANG H, SUN X M, LIU Y L, et al. Natural language watermarking using Chinese syntactic transformations[J]. Information Technology Journal, 2008, 7(6): 904-910.
[13]	YANG T Y, WU H Z, YI B, et al. Semantic-preserving linguistic steganography by pivot translation and semantic-aware bins coding[J]. arXiv Preprint, arXiv:2203.03795, 2022.
[14]	WINSTEIN K. Lexical steganography through adaptive modulation of the word choice hash[R]. 1999.
[15]	BOLSHAKOV I A. A method of linguistic steganography based on collocationally-verified synonymy[C]// Proceedings of International Workshop on Information Hiding. Berlin: Springer, 2004: 180-191.
[16]	UEOKA H, MURAWAKI Y, KUROHASHI S. Frustratingly easy edit-based linguistic steganography with a masked language model[C]// Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 2021: 5486-5492.
[17]	YANG X, ZHANG J, CHEN K, et al. Tracing text provenance via context-aware lexical substitution[C]// Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022: 11613-11621.
[18]	WU R F, HE L, FANG D Y. Automatic evaluation scheme for imperceptibility of natural language watermarking[J]. Journal of Computer Applications, 2013, 33(12): 3522-3526, 3530.
[19]	YANG J L, WANG J M, WANG C K, et al. A novel scheme for watermarking natural language text[C]// Proceedings of the Third International Conference on Intelligent Information Hiding and Multimedia Signal Processing. Piscataway: IEEE Press, 2007: 481-484.
[20]	LIN J B, HE L, LI T Z, et al. An anti-attack watermarking based on synonym substitution algorithm for Chinese text[J]. Journal of Northwest University (Natural Science Edition), 2010, 40(3): 433-436.
[21]	ZHENG X Y, WU H Z. Autoregressive linguistic steganography based on BERT and consistency coding[J]. Security and Communication Networks, 2022, 2022: 1-11.
[22]	ZHENG X Y, FANG Y R, WU H Z. General framework for reversible data hiding in texts based on masked language modeling[J]. arXiv Preprint, arXiv:2206.10112, 2022.
[23]	CHANG C C. Reversible linguistic steganography with Bayesian masked language modeling[J]. IEEE Transactions on Computational Social Systems, 2023, 10(2): 714-723.
[24]	YANG X, LI F, XIANG L Y. Synonym substitution-based steganographic algorithm with matrix coding[J]. Journal of Chinese Computer Systems, 2015, 36(6): 1296-1300.
[25]	XIANG L Y, WU W S, LI X, et al. A linguistic steganography based on word indexing compression and candidate selection[J]. Multimedia Tools and Applications, 2018, 77(21): 28969-28989.
[26]	YANG Z L, GUO X Q, CHEN Z M, et al. RNN-stega: linguistic steganography based on recurrent neural networks[J]. IEEE Transactions on Information Forensics and Security, 2019, 14(5): 1280-1295.
[27]	YU L, LU Y L, YAN X H, et al. MTS-Stega: linguistic steganography based on multi-time-step[J]. Entropy, 2022, 24(5): 585.
[28]	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[J]. arXiv Preprint, arXiv:1706.03762, 2017.
[29]	HILL J, SIMHA R. Automatic generation of context-based fill-in-the-blank exercises using co-occurrence likelihoods and Google n-grams[C]// Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications. Stroudsburg: Association for Computational Linguistics, 2016: 23-30.
[30]	FEDUS W, GOODFELLOW I, DAI A M. MaskGAN: better text generation via filling in the ______[J]. arXiv Preprint, arXiv:1801.07736, 2018.
[31]	ZHU W, HU Z, XING E. Text infilling[J]. arXiv Preprint, arXiv:1901.00158, 2019.
[32]	LIU Y H, OTT M, GOYAL N, et al. RoBERTa: a robustly optimized BERT pretraining approach[J]. arXiv Preprint, arXiv:1907.11692, 2019.

k	Total number of windows	Embedding success rate (d=1 bit)	Embedding success rate (d=2 bit)
4	22748	99.40%	94.74%
5	18198	99.82%	97.34%
6	15165	99.97%	98.57%

Related Articles 15

[1]	Deyang WU, Sen HU, Miaomiao WANG, Haibo JIN, Changbo QU, Yong TANG. Discriminative zero-watermarking algorithm based on region XOR and ternary quantization[J]. Journal on Communications, 2022, 43(2): 208-222.
[2]	Xi YIN, Weiqing HUANG. Research on color QR code watermarking technology based on chaos theory[J]. Journal on Communications, 2018, 39(7): 50-58.
[3]	Wenxian JIANG, Zhenxing ZHANG, Jingjing WU. Reversible digital watermarking-based protocol for data integrity in wireless sensor network[J]. Journal on Communications, 2018, 39(3): 118-127.
[4]	Feng XU, Jia-nan LI, Jian-guo SUN. Research of composite safety protection for digital maps[J]. Journal on Communications, 2016, 37(2): 174-179.
[5]	Ming-zhu LAI, Li-guo ZHANG, Wei-miao FENG, Yuan-yuan WANG, Yong WANG, Shou-zheng LI. Study on authority watermark of the electronic chart based on the semantics characteristics[J]. Journal on Communications, 2016, 37(11): 137-145.
[6]	Image watermarking algorithm against attacks based on SIFT feature point and cross-ratio value[J]. Journal on Communications, 2014, 35(11): 20-180.
[7]	Tian-yu YE. Perfectly blind image watermarking scheme with multi-purpose based on region segment for sub-block and self-embedding technology[J]. Journal on Communications, 2013, 34(3): 148-156.
[8]	Tian-yu YE. Self-embedding robust digital watermarking algorithm with perfectly blind detection[J]. Journal on Communications, 2012, 33(10): 7-15.
[9]	Jian-guo SUN, Guo-yin ZHANG, Jun-peng WU, Ai-hong YAO. Performance verification of the digital watermarking for vector map based on SPA[J]. Journal on Communications, 2010, 31(9A): 239-244.
[10]	Jian-guo SUN, Chao-guang MEN, Chun-guang MA, Cheng-ming LI. Authentication-based double images fractal watermarking model of vector maps[J]. Journal on Communications, 2009, 30(9): 24-28.
[11]	Li LIU, Dai-yuan PENG, Xiao-ju LI. Secure video watermarking scheme for broadcast monitoring[J]. Journal on Communications, 2009, 30(8): 51-55.
[12]	Hong-mei YANG, Yong-quan LIANG, Lian-shan LIU, Shu-juan JI. HVS-based imperceptibility measure of watermark in watermarked color image[J]. Journal on Communications, 2008, 29(2): 95-100.
[13]	Quan WEN, Shu-xun WANG, Gang GUO, Yu-fei WANG. Using Duffing equation chaotic phase change to detect digital watermarking[J]. Journal on Communications, 2008, 29(11A): 144-148.
[14]	Wen-fa QI, Xiao-long LI, Bin YANG, Dao-fang CHENG. Document watermarking scheme for information tracking[J]. Journal on Communications, 2008, 29(10): 183-190.
[15]	Xiang-yang WANG, Hong-ying YANG, Pan-pan NIU. Novel blind audio watermarking algorithm in the hybrid domain[J]. Journal on Communications, 2007, 28(2): 109-114.