[1] 陈游旻, 李飞, 舒继武. 大数据环境下的存储系统构建:挑战、方法和趋势[J]. 大数据, 2019, 5(4): 27-40.
CHEN Y M, LI F, SHU J W. Building storage systems in big data era: challenges, methods and trends[J]. Big Data Research, 2019, 5(4): 27-40.
[2] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// The 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
[3] IOFFE S, SZEGEDY C. Batch normalization: accelerating deep network training by reducing internal covariate shift[C]// The 32nd International Conference on Machine Learning. New York: ACM Press, 2015: 448-456.
[4] KRIZHEVSKY A, SUTSKEVER I, HINTON G. ImageNet classification with deep convolutional neural networks[C]// The 26th Annual Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2012: 1106-1114.
[5] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
[6] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C]// The 3rd International Conference on Learning Representations. [S.l.: s.n.], 2015.
[7] SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions[C]// The 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 1-9.
[8] ZAGORUYKO S, KOMODAKIS N. Wide residual networks[J]. arXiv preprint, 2016, arXiv:1605.07146.
[9] DEVLIN J, CHANG M-W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[J]. arXiv preprint, 2018, arXiv:1810.04805.
[10] 王孝远, 廖小飞, 刘海坤, 等. 面向大数据的异构内存系统[J]. 大数据, 2018, 4(4): 15-34.
WANG X Y, LIAO X F, LIU H K, et al. Big data oriented hybrid memory systems[J]. Big Data Research, 2018, 4(4): 15-34.
[11] 李鑫, 陈璇, 黄志球. 面向大数据应用的混合内存架构特征分析[J]. 大数据, 2018, 4(3): 61-80.
LI X, CHEN X, HUANG Z Q. Analysis on hybrid memory architecture for big data application[J]. Big Data Research, 2018, 4(3): 61-80.
[12] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[13] ABADI M, BARHAM P, CHEN J M, et al. TensorFlow: a system for large-scale machine learning[C]// The 12th USENIX Symposium on Operating Systems Design and Implementation. Berkeley: USENIX Association, 2016: 265-283.
[14] BERGSTRA J, BREULEUX O, BASTIEN F, et al. Theano: a CPU and GPU math compiler in Python[C]// The 9th Python for Scientific Computing Conference. [S.l.: s.n.], 2010: 1-7.
[15] PASZKE A, GROSS S, MASSA F, et al. PyTorch: an imperative style, high-performance deep learning library[C]// The 2019 Annual Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2019: 8024-8035.
[16] CHEN T Q, LI M, LI Y T, et al. MXNet: a flexible and efficient machine learning library for heterogeneous distributed systems[J]. arXiv preprint, 2015, arXiv:1512.01274.
[17] DEAN J, CORRADO G, MONGA R, et al. Large scale distributed deep networks[C]// The 26th Annual Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2012: 1232-1240.
[18] RHU M, GIMELSHEIN N, CLEMONS J, et al. vDNN: virtualized deep neural networks for scalable, memory-efficient neural network design[C]// The 49th Annual IEEE/ACM International Symposium on Microarchitecture. Piscataway: IEEE Press, 2016: 1-13.
[19] CHEN M, SUN M M, YANG J, et al. Training deeper models by GPU memory optimization on TensorFlow[C]// Advances in Neural Information Processing Systems 30. [S.l.: s.n.], 2017.
[20] CHEN X M, CHEN D Z, HAN Y H, et al. moDNN: memory optimal deep neural network training on graphics processing units[J]. IEEE Transactions on Parallel and Distributed Systems, 2019, 30(3): 646-661.
[21] JIN H, LIU B, JIANG W B, et al. Layer-centric memory reuse and data migration for extreme-scale deep learning on many-core architectures[J]. ACM Transactions on Architecture and Code Optimization, 2018, 15(3): 1-26.
[22] WANG L N, YE J M, ZHAO Y Y, et al. Superneurons: dynamic GPU memory management for training deep neural networks[C]// The 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. New York: ACM Press, 2018: 41-53.
[23] PENG X, SHI X H, DAI H L, et al. Capuchin: tensor-based GPU memory management for deep learning[C]// The 25th International Conference on Architectural Support for Programming Languages and Operating Systems. New York: ACM Press, 2020: 891-905.
[24] CHEN T Q, XU B, ZHANG C Y, et al. Training deep nets with sublinear memory cost[J]. arXiv preprint, 2016, arXiv:1604.06174.
[25] GRUSLYS A, MUNOS R, DANIHELKA I, et al. Memory-efficient backpropagation through time[C]// The 2016 Annual Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2016: 4125-4133.
[26] KUSUMOTO M, INOUE T, WATANABE G, et al. A graph theoretic framework of recomputation algorithms for memory-efficient backpropagation[C]// The 2019 Annual Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2019: 1161-1170.
[27] JAIN P, JAIN A, NRUSIMHA A, et al. Checkmate: breaking the memory wall with optimal tensor rematerialization[C]// Machine Learning and Systems 2020. [S.l.: s.n.], 2020: 497-511.
[28] RHU M, O’CONNOR M, CHATTERJEE N, et al. Compressing DMA engine: leveraging activation sparsity for training deep neural networks[C]// The IEEE 24th International Symposium on High Performance Computer Architecture. Piscataway: IEEE Press, 2018: 78-91.
[29] JAIN A, PHANISHAYEE A, MARS J, et al. Gist: efficient data encoding for deep neural network training[C]// The 45th ACM/IEEE Annual International Symposium on Computer Architecture. Piscataway: IEEE Press, 2018: 776-789.
[30] HAN S, MAO H Z, DALLY W. Deep compression: compressing deep neural network with pruning, trained quantization and Huffman coding[C]// The 4th International Conference on Learning Representations. [S.l.: s.n.], 2016.