Potential High-Value Passenger Discovery Based on SSOMaj-SMOTE-SSOMin Sampling and Ensemble Learning

Abstract: Potential high-value passenger data are highly imbalanced, and passenger features are only weakly correlated with value categories. To address these problems, a potential high-value passenger discovery model based on triple hybrid sampling (SSOMaj-SMOTE-SSOMin) and ensemble learning is proposed. The RFM (Recency, Frequency, Monetary) method is used to label passenger categories; SSOMaj-SMOTE-SSOMin sampling is used to resample the imbalanced passenger dataset; a fusion feature selection (FFS) algorithm is used to select passenger features; and a gradient boosting decision tree (GBDT) is used as the classifier to build a passenger value prediction model that identifies potential high-value passengers. Experimental results on the PNR dataset show that, compared with the baseline algorithm, the proposed model achieves better AUC and F1 values and can better identify potential high-value passengers.
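To make the first two stages of the pipeline concrete, the following is a minimal sketch that uses only the Python standard library. It is not the authors' implementation. The equal-weight RFM score, the `top_fraction` cutoff, the helper names, and the toy records are all assumptions made for illustration. The abstract does not describe the SSOMaj and SSOMin steps, so they are not sketched here; only the plain SMOTE interpolation step is shown.

```python
import math
import random


def min_max(values):
    """Scale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rfm_labels(records, top_fraction=0.2):
    """Label passengers 1 (high value) or 0 from (recency_days, frequency, monetary).

    Assumption: equal weights and a top-fraction cutoff; the paper's exact
    RFM labeling rule is not given in the abstract.
    """
    r = min_max([rec[0] for rec in records])
    f = min_max([rec[1] for rec in records])
    m = min_max([rec[2] for rec in records])
    # A small recency (recent travel) is better, so the recency score is inverted.
    scores = [(1.0 - ri) + fi + mi for ri, fi, mi in zip(r, f, m)]
    n_high = max(1, round(top_fraction * len(records)))
    cutoff = sorted(scores, reverse=True)[n_high - 1]
    # Ties at the cutoff may label slightly more than n_high passengers.
    return [1 if s >= cutoff else 0 for s in scores]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Each synthetic point lies on the segment between a random minority
    sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nb = rng.choice(neighbours)
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Toy records (hypothetical): (days since last trip, number of trips, total spend)
records = [
    (5, 12, 9000.0),
    (200, 1, 300.0),
    (30, 6, 4000.0),
    (365, 1, 150.0),
    (90, 2, 800.0),
]
labels = rfm_labels(records, top_fraction=0.4)
minority = [rec for rec, y in zip(records, labels) if y == 1]
# In practice features would be scaled before computing distances.
synthetic = smote(minority, n_new=3)
print(labels)
print(synthetic)
```

In a full pipeline the resampled data would then go through feature selection and into a gradient boosting classifier, with AUC and F1 computed on a held-out set; those stages are omitted here because the abstract gives no details about them.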

     
