Traffic Prediction and Uncertainty Interval Estimation for E-Commerce Clusters
Abstract: Traffic prediction is essential for intelligent capacity planning and task scheduling. However, traffic in large-scale e-commerce clusters is subject to unpredictable events, such as online promotions and surges of concurrent user requests. These events produce many bursts in the time series, which makes traffic prediction very challenging. Capacity prediction should also be robust to this uncertainty: it should cope with situations that may arise in the future and keep the cluster stable, rather than shrinking capacity strictly according to the predicted value. To address the traffic scenarios of large-scale distributed e-commerce clusters and the requirements of dynamic capacity planning, this paper proposes a real-time traffic prediction framework that includes uncertainty estimation. The framework is based on a multivariate long short-term memory (LSTM) auto-encoder and Bayesian theory, and it provides accurate uncertainty interval estimates alongside deterministic traffic predictions.
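To make the idea of an uncertainty interval concrete, the following is a minimal sketch and not the framework proposed in the paper. It assumes the uncertainty is approximated by repeating a stochastic forecast many times and combining the spread of those forecasts with an assumed noise level under a Gaussian approximation; the abstract only states that the framework uses Bayesian theory, so this sampling scheme is an assumption. The names `prediction_interval`, `toy_stochastic_forecast`, `noise_std`, and the traffic numbers are all hypothetical, and the toy forecaster merely stands in for a trained LSTM auto-encoder.

```python
import random
import statistics


def prediction_interval(samples, noise_std=0.0, z=1.96):
    """Return (mean, lower, upper) from repeated stochastic predictions.

    samples   -- predictions for one future time step, one per stochastic pass
    noise_std -- assumed inherent noise level (hypothetical parameter)
    z         -- Gaussian quantile; 1.96 gives an approximate 95% interval
    """
    mean = statistics.fmean(samples)
    # Spread across passes approximates the model's own uncertainty.
    model_var = statistics.pvariance(samples, mu=mean)
    # Add the assumed noise variance to obtain the total predictive spread.
    total_std = (model_var + noise_std ** 2) ** 0.5
    return mean, mean - z * total_std, mean + z * total_std


def toy_stochastic_forecast(history, rng):
    """Stand-in for one stochastic forward pass of a trained model."""
    base = sum(history[-3:]) / 3
    return base * rng.gauss(1.0, 0.05)


rng = random.Random(0)
history = [1200.0, 1350.0, 1500.0, 2100.0, 1900.0]  # made-up requests per second
samples = [toy_stochastic_forecast(history, rng) for _ in range(200)]
mean, lower, upper = prediction_interval(samples, noise_std=50.0)

# Provision for the upper bound rather than the point forecast, so that
# capacity is not shrunk strictly according to the predicted value.
capacity_target = upper
```

In this sketch the capacity target follows the upper bound of the interval, which reflects the abstract's point that capacity decisions should tolerate bursts instead of tracking the point prediction exactly.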