A Theoretical Study of a Multivariate Pseudo-Poisson Mixture Distribution Model
Abstract: The component weights in a mixture distribution model usually depend on known or unknown parameters, which makes the model uncertain. To address this issue, we propose a mixture distribution model whose weights are determined by the Frobenius norm. First, the multivariate Poisson distribution is truncated and homogenized to generate a multivariate pseudo-Poisson distribution. Second, starting from the expression for a finite countable mixture distribution, we derive in turn the set function matrix of the multivariate pseudo-Poisson mixture distribution, the pseudo-Boolean function matrix in multilinear form, and the Frobenius norm of that multilinear pseudo-Boolean function matrix. These yield new weights, from which the multivariate pseudo-Poisson mixture distribution model is constructed. Finally, the correctness of the model is proved from the normalization and nonnegativity of the mixture weights, and simulation experiments demonstrate the entire model-building process and verify that the use of the arithmetic average is reasonable. The proposed model can provide a theoretical basis for future research on applications and algorithm design of mixture distributions in machine learning.