[1] |
孟小峰, 慈祥 . 大数据管理:概念、技术与挑战[J]. 计算机研究与发展, 2013,50(1): 146-169.
|
|
MENG X F , CI X . Big data management:concepts,techniques and challenges[J]. Journal of Computer Research and Development, 2013,50(1): 146-169.
|
[2] |
付钰, 李洪成, 吴晓平 ,等. 基于大数据分析的APT攻击检测研究综述[J]. 通信学报, 2015,36(11): 1-14.
|
|
FU Y , LI H C , WU X P ,et al. Detecting APT attacks:a survey from the perspective of big data analysis[J]. Journal on Communications, 2015,36(11): 1-14.
|
[3] |
STRANDE S M , CICOTTI P , SINKOVITS R S ,et al. Gordon:design,performance,and experiences deploying and supporting a data intensive supercomputer[C]// The 1st Conference on the Extreme Science and Engineering Discovery Environment. 2012: 1-8.
|
[4] |
杜小勇, 陈峻, 陈跃国 . 大数据探索式搜索研究[J]. 通信学报, 2015,36(12): 77-88.
|
|
DU X Y , CHEN J , CHEN Y G . Exploratory search on big data[J]. Journal on Communications, 2015,36(12): 77-88.
|
[5] |
ZAHARIA M , CHOWDHURY M , DAS T ,et al. Fast and interactive analytics over hadoop data with spark[J]. Login, 2012,37(4): 45-51.
|
[6] |
ZAHARIA M , XIN R , WENDELL P ,et al. Apache Spark:a unified engine for big data processing[J]. Communications of the ACM, 2016,59(11): 56-65.
|
[7] |
CARBONE P , EWEN S , HARIDI S ,et al. Apache flink:stream and batch processing in a single engine[J]. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 2015,36(4): 28-38.
|
[8] |
TUMMALAPALLI S , MACHAVARAPU V R . Managing mysql cluster data using cloudera impala[J]. Procedia Computer Science, 2016,85(5): 463-474.
|
[9] |
SIKKA V , LEHNER W , SANG K C ,et al. Efficient transaction processing in SAP HANA database:the end of a column store myth[C]// The 2012 ACM SIGMOD International Conference on Management of Data. 2012: 731-742.
|
[10] |
DEAN J , GHEMAWAT S . MapReduce:simplifed data processing on large clusters[C]// The Conference on Operating System Design and Implementation (OSDI). 2004: 137-150.
|
[11] |
ZAHARIA M , CHOWDHURY M , DAS T ,et al. Resilient distributed datasets:a fault-tolerant abstraction for in-memory cluster computing[C]// The 9th USENIX Conference on Networked Systems Design and Implementation. 2012:2.
|
[12] |
LIN X , WANG P , WU B . Log analysis in cloud computing environment with hadoop and spark[C]// The 5th IEEE International Conference on Broadband Network & Multimedia Technology (IC-BNMT). 2013: 273-276.
|
[13] |
DONG X , XIE Y , MURALIMANOHAR N ,et al. Hybrid checkpointing using emerging nonvolatile memories for future exascale system[J]. ACM Transactions on Architecture and Code Optimization (TACO), 2011,8(2): 1-29.
|
[14] |
田俊峰, 张亚姣 . 基于马尔可夫的检查点可信评估方法[J]. 通信学报, 2015,36(1): 234-240.
|
|
TIAN J F , ZHANG Y J . Checkpoint trust evaluation method based on Markov[J]. Journal on Communications, 2015,36(1): 234-240.
|
[15] |
ARMBRUST M , XIN R S , LIAN C ,et al. Spark SQL:relational data processing in spark[C]// The 2015 ACM SIGMOD International Conference on Management of Data. 2015: 1383-1394.
|
[16] |
IQBAL M H , SOOMRO T R . Big data analysis:apache storm perspective[J]. International Journal of Computer Trends & Technology, 2015,19(1): 9-14.
|
[17] |
ZAHARIA M , DAS T , LI H Y ,et al. Discretized streams:fault-tolerant streaming computation at scale[C]// ACM Symposium on Operating Systems Principles. 2013: 423-438.
|
[18] |
MENG X , BRADLEY J , YAVUZ B ,et al. MLlib:machine learning in apache Spark[J]. Journal of Machine Learning Research, 2015,17(1): 1235-1241.
|
[19] |
GONZALEZ J E , XIN R S , DAVE A ,et al. GraphX:graph processing in a distributed dataflow framework[C]// The 11th USENIX conference on Operating Systems Design and Implementation. 2014: 599-613.
|
[20] |
廖彬, 于炯, 孙华 ,等. 基于存储结构重配置的分布式存储系统节能算法[J]. 计算机研究与发展, 2013,50(1): 3-18.
|
|
LIAO B , YU J , SUN H ,et al. Energy-efficient algorithms for distributed storage system based on data storage structure reconfiguration[J]. Journal of Computer Research and Development, 2013,50(1): 3-18.
|
[21] |
DEAN J , GHEMAWAT S . MapReduce:simplified data processing on large clusters[J]. Operating Systems Design & Implementation, 2004,5(1): 147-152.
|
[22] |
KWON Y , BALAZINSKA M , HOWE B ,et al. A study of skew in MapReduce application[J]. Open Cirrus Summit, 2011,1: 1-5.
|
[23] |
KWON Y , BALAZINSKA M , HOWE B ,et al. Skew-resistant parallel processing of feature-extracting scientific user-defined functions[C]// The 1st ACM Symposium on Cloud Computing. 2010: 75-86.
|
[24] |
王卓, 陈群, 李战怀 ,等. 基于增量式分区策略的MapReduce数据均衡方法[J]. 计算机学报, 2016,39(1): 19-35.
|
|
WANG Z , CHEN Q , LI Z H ,et al. An incremental partitioning strategy for data balance on MapReduce[J]. Chinese Journal of Computers, 2016,39(1): 19-35.
|
[25] |
KWON Y , BALAZINSKA M , HOWE B ,et al. SkewTune:mitigating skew in MapReduce applications[C]// The 2012 ACM SIGMOD International Conference on Management of Data. 2012: 25-36.
|
[26] |
YAN W , XUE Y , MALIN B . Scalable and robust key group size estimation for reducer load balancing in MapReduce[C]// IEEE Int Conference on Big Data. 2013: 156-162.
|
[27] |
RAMAKRISHNAN S R , SWART G , URMANOV A ,et al. Balancing reducer skew in MapReduce workloads using progressive sampling[C]// The 3rd ACM Symposium on Cloud Computing (SOCC’12). 2012: 1-14.
|
[28] |
GUFLER B , AUGSTEN N , REISER A ,et al. Handing data skew in MapReduce[C]// The 1st International Conference on Cloud Computing and Services Science. 2011: 574-583.
|
[29] |
GUFLER B , AUGSTEN N , REISER A ,et al. Load balancing in MapReduce based on scalable cardinality estimates[C]// The 28th IEEE International Conference on Data Engineering (ICDE). 2012: 522-533.
|
[30] |
TANG Z , ZHANG X S , LI K ,et al. An intermediate data placement algorithm for load balancing in Spark computing[J]. Future Generation Computer Systems, 2016.
|
[31] |
KOLB L , THOR A , RAHM E . Load balancing for MapReduce-based entity resolution[C]// The 28th IEEE International Conference on Big Data Engineering (ICDE). 2012: 618-629.
|
[32] |
KOLB L , THOR A , RAHM E ,et al. Block-based load balancing for entity resolution with MapReduce[C]// The 20th ACM International Conference on Information and Knowledge Management (CIKM). 2011: 2397-2400.
|
[33] |
CHEN Q , YAO J Y , XIAO Z . Libra:lightweight data skew mitigation in MapReduce[J]. IEEE Transactions on Parallel & Distributed Systems, 2015,26(9): 2520-2533.
|
[34] |
RACHA S C . Load balancing MapReduce communications for efficient executions of applications in a cloud[M]. India,Bangalore: Indian Institute of Science, 2012: 12-16.
|
[35] |
IBRAHIM S , JIN H , LU L ,et al. Handling partitioning skew in MapReduce using LEEN[J]. Peer-to-Peer Networking and Applications, 2013,6(4): 409-424.
|
[36] |
DAI W , IBRAHIM I , BASSIOUNI M . Improving load balance for data-intensive computing on cloud platforms[C]// 2016 IEEE International Conference on Smart Cloud. 2016: 140-145.
|
[37] |
TANG Z , ZHANG X S , LI K L ,et al. A data skew oriented reduce placement algorithm based on sampling[J]. IEEE Transactions on Cloud Computing, 2016.
|
[38] |
FAN Y Q , WU W G , XU Y L ,et al. Improving MapReduce performance by balancing skewed loads[J]. Communications, 2014,11(8): 85-108.
|
[39] |
TRIGUERO I , GALAR M , VLUYMANS S . Evolutionary undersampling for extremely imbalanced big data classification under apache spark[C]// 2016 IEEE Congress on Evolutionary Computation. 2016: 715-722.
|
[40] |
MESTRE D G , PIRES C E , NASCIMENTO D C ,et al. An efficient spark-based adaptive windowing for entity matching[J]. Journal of Systems and Software, 2017,128(6): 1-10.
|
[41] |
GHODSI A , ZAHARIA M , SHENKER S ,et al. Choosy:max-min fair sharing for datacenter jobs with constraints[C]// The 8th ACM European Conference on Computer Systems. 2013: 365-378.
|