[1] |
HAO Y , QUIGLEY S . The implementation of a deep recurrent neural network language model on a Xilinx FPGA[J]. arXiv Preprint arXiv:1710.10296, 2017.
|
[2] |
SAK H , SENIOR A , BEAUFAYS F . Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition[J]. arXiv Preprint arXiv:1402.1128, 2014.
|
[3] |
MIKOLOV T , KARAFIAT M , BURGET L ,et al. Recurrent neural network based language model[C]// Eleventh Annual Conference of the International Speech Communication Association. 2010.
|
[4] |
CHO K , VAN -MERRIENBOER B , GULCEHRE C ,et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[J]. arXiv Preprint arXiv:1406.1078, 2014.
|
[5] |
GRAVES A , MOHAMED A , HINTON G . Speech recognition with deep recurrent neural networks[C]// 2013 IEEE International Conference on.Acoustics,speech and signal processing (icassp). 2013: 6645-6649.
|
[6] |
BYEONW , BREUEL T M , RAUE F , et al . Scene labeling with LSTM recurrent neural networks[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 3547-3555.
|
[7] |
ZHANG Y , WANG C , GONG L ,et al. A power-efficient accelerator based on FPGA for LSTM network[C]// 2017 IEEE International Conference on Cluster Computing (CLUSTER). 2017: 629-630.
|
[8] |
GUO K , ZENG S , YU J ,et al. A survey of FPGA-based neural network accelerator[J]. arXiv preprint arXiv:1712.08934, 2017.
|
[9] |
HWANG K , SUNG W . Single stream parallelization of generalized LSTM-like RNNs on a GPU[J]. arXiv Preprint arXiv:1503.02852, 2015.
|
[10] |
ABADI M , AGARWAL A , BARHAM P ,et al. Tensorflow:largescale machine learning on heterogeneous distributed systems[J]. arXiv preprint arXiv:1603.04467, 2016.
|
[11] |
OUYANG P , YIN S , WEI S . A fast and power efficient architecture to parallelize LSTM based RNN for cognitive intelligence applications[C]// The 54th Annual Design Automation Conference 2017. ACM, 2017:63.
|
[12] |
NURVITADHI E , SIM J , SHEFFIELD D ,et al. Accelerating recurrent neural networks in analytics servers:comparison of FPGA,CPU,GPU,and ASIC[C]// 2016 26th International Conference on Field Programmable Logic and Applications (FPL). 2016: 1-4.
|
[13] |
HOPFIELD J J . Neural networks and physical systems with emergent collective computational abilities[J]. Proceedings of the National Academy of Sciences, 1982,79(8): 2554-2558.
|
[14] |
HOCHREITER S , SCHMIDHUBER J . Long short-term memory[J]. Neural Computation, 1997,9(8): 1735-1780.
|
[15] |
CHUNG J , GULCEHRE C , CHO K H ,et al. Empirical evaluation of gated recurrent neural networks on sequence modeling[J]. arXiv Preprint arXiv:1412.3555, 2014.
|
[16] |
ZAREMBA W , SUTSKEVER I , VINYALS O . Recurrent neural network regularization[J]. arXiv Preprint arXiv:1409.2329, 2014.
|
[17] |
RYBALKIN V , PAPPALARDO A , GHAFFAR M M ,et al. FINN-L:library extensions and design trade-off analysis for variable precision LSTM networks on FPGAs[J]. arXiv Preprint arXiv:1807.04093, 2018.
|
[18] |
RYBALKIN V , WEHN N , YOUSEFI M R ,et al. Hardware architecture of bidirectional long short-term memory neural network for optical character recognition[C]// The Conference on Design,Automation & Test in Europe.European Design and Automation Association. 2017: 1394-1399.
|
[19] |
GUAN Y , LIANG H , XU N ,et al. FP-DNN:an automated framework for mapping deep neural networks onto FPGAs with RTL-HLS hybrid templates[C]// 2017 IEEE 25th Annual International Symposium on Field-programmable Custom Computing Machines (FCCM). 2017: 152-159.
|
[20] |
LI S , LI W , COOK C ,et al. Independently recurrent neural network (indrnn):building a longer and deeper RNN[C]// IEEE Conference on Computer Vision and Pattern Recognition. 2018: 5457-5466.
|
[21] |
HAJDUK Z . Reconfigurable FPGA implementation of neural networks[J]. Neuro Computing, 2018,308: 227-234.
|
[22] |
LIU B , DONG W , XU T ,et al. E-ERA:an energy-efficient reconfigurable architecture for RNN using dynamically adaptive approximate computing[J]. IEICE Electronics Express, 2017,14(15): 20170637-20170637.
|
[23] |
宋翔, 周凡, 陈耀武 ,等. 基于 FPGA 的实时双精度浮点矩阵乘法器设计[J]. 浙江大学学报(工学版), 2008,42(9): 1611-1615.
|
|
SONG X , ZHOU F , CHEN Y W ,et al. Design of real time double precision floating point matrix multiplier based on FPGA[J]. Journal of ZheJiang University, 2008,42(9): 1611-1615.
|
[24] |
GUAN Y , YUAN Z , SUN G ,et al. FPGA-based accelerator for long short-term memory recurrent neural networks[C]// IEEE Design Automation Conference (ASP-DAC). 2017: 629-634.
|
[25] |
CHANG A X M , CULURCIELLO E . Hardware accelerators for recurrent neural networks on FPGA[C]// 2017 IEEE International Symposium on.Circuits and Systems (ISCAS). 2017: 1-4.
|
[26] |
CHANG A X M , MARTINI B , CULURCIELLO E . Recurrent neural networks hardware implementation on FPGA[J]. arXiv Preprint arXiv:1511.05552, 2015.
|
[27] |
LI S , WU C , LI H ,et al. Fpga acceleration of recurrent neural network based language model[C]// 2015 IEEE 23rd Annual International Symposium on Field-programmable Custom Computing Machines. IEEE, 2015: 111-118.
|
[28] |
LEE M , HWANG K , PARK J ,et al. FPGA-based low-powerspeech recognition with recurrent neural networks[C]// 2016 IEEE International Workshop on.Signal Processing Systems (SiPS). 2016: 230-235.
|
[29] |
WANG S , LI Z , DING C ,et al. C-LSTM:enabling efficient LSTM using structured compression techniques on FPGAs[C]// ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. ACM, 2018: 11-20.
|
[30] |
ZHANG Y , WANG C , GONG L ,et al. Implementation and optimization of the accelerator based on FPGA hardware for LSTM network[C]// IEEE International Symposium on Parallel and Distributed Processing with Applications and IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC). 2017: 614-621.
|
[31] |
LIAO Y , LI H , WANG Z . Based real-time processing architecture for recurrent neural network[C]// International Conference on Intelligent and Interactive Systems and Applications. 2017: 705-709.
|
[32] |
SALCIC Z , BERBER S , SECKER P . FPGA prototyping of RNN decoder for convolutional codes[J]. EURASIP Journal on Advances in Signal Processing, 2006,2006(1):015640.
|
[33] |
FERREIRA J C , FONSECA J . An FPGA implementation of a long short-term memory neural network[C]// 2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig). 2016: 1-8.
|
[34] |
SHIN S , HWANG K , SUNG W . Fixed-point performance analysis of recurrent neural networks[J]. arXiv Preprint arXiv:1512.01322, 2015.
|
[35] |
HAN S , POOL J , TRAN J ,et al. Learning both weights and connections for efficient neural network[C]// Advances in Neural Information Processing Systems. 2015: 1135-1143.
|
[36] |
HAN S , KANG J , MAO H ,et al. Ese:efficient speech recognition engine with sparse LSTM on FPGA[C]// ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 2017: 75-84.
|
[37] |
ALI S M , SHAOJUN W , NING M ,et al. A bandwidth in-sensitive low stall sparse matrix vector multiplication architecture on reconfigurable FPGA platform[C]// 13th IEEE International Conference on Electronic Measurement & Instruments (ICEMI). 2017: 171-176.
|
[38] |
FOWERS J , OVTCHAROV K , STRAUSS K ,et al. A high memory bandwidth FPGA accelerator for sparse matrix-vector multiplication[C]// IEEE 22nd Annual International Symposium on FieldProgrammable Custom Computing Machines (FCCM). 2014: 36-43.
|
[39] |
NEIL D , LEE J H , DELBRUCK T ,et al. Delta networks for optimized recurrent network computation[J]. arXiv Preprint arXiv:1612.05571, 2016.
|
[40] |
GAO C , NEIL D , CEOLINI E ,et al. DeltaRNN:a power-efficient recurrent neural network accelerator[C]// ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 2018: 21-30.
|
[41] |
KINGSBURY B E D , SAINATH T N , SINDHWANI V . Low-rank matrix factorization for deep belief network training with high-dimensional output targets[P].2016-2-16.
|
[42] |
XUE J , LI J , GONG Y . Restructuring of deep neural network acoustic models with singular value decomposition[C]// Interspeech. 2013: 2365-2369.
|
[43] |
QIU J , WANG J , YAO S ,et al. Going deeper with embedded FPGA platform for convolutional neural network[C]// ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. ACM, 2016: 26-35.
|
[44] |
LU Z , SINDHWANI V , SAINATH T N . Learning compact recurrent neural networks[J]. arXiv Preprint arXiv:1604.02594, 2016.
|
[45] |
RIZAKIS M , VENIERIS S I , KOURIS A ,et al. Approximate FPGA-based LSTM under computation time constraints[J]. arXiv Preprint arXiv:1801.02190, 2018.
|
[46] |
LI Z , WANG S , DING C ,et al. Efficient recurrent neural networks using structured matrices in FPGA[J]. arXiv Preprint arXiv:1803.07661, 2018.
|
[47] |
WANG Z , LIN J , WANG Z . Accelerating recurrent neural networks:a memory-efficient approach[J]. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2017,25(10): 2763-2775.
|