Age Estimation of Face Images Based on an Attention ConvLSTM Model

Abstract: To address the low accuracy of age estimation from face images with respect to fine-grained spatial and time-series features, this paper proposes a convolutional long short-term memory (ConvLSTM) network model based on an attention mechanism. An attention mechanism is introduced between the previous hidden state of the ConvLSTM model and the current input state to increase the weight of the feature factors that have a significant impact on age estimation. Channel weight factors are obtained by average pooling, and the attention weights are normalized to yield a new input state, which the ConvLSTM model then uses for feature extraction and age estimation. To verify the effectiveness of the model, experiments were conducted on the FG-NET and MORPH face datasets, with mean absolute error (MAE) and cumulative score (CS) as the evaluation metrics. The results show that the model achieves an MAE of 3.60 on FG-NET and 2.45 on MORPH, and a CS of 89.3% on MORPH. Compared with the non-attention ConvLSTM model and the LSTM model, its CS is higher by an average of 0.80 and 4.60 percentage points, respectively. The model also performs well in terms of model complexity.
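The two evaluation metrics named in the abstract can be sketched in a few lines. This is a minimal illustration using the standard definitions of MAE and CS; the function names, the sample ages, and the 5-year threshold are assumptions made for the example, since the abstract does not state the error threshold behind the reported CS.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """MAE: mean of |predicted - true| over all test images, in years."""
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(p - t) for t, p in zip(true_ages, predicted_ages)]
    return sum(errors) / len(errors)


def cumulative_score(true_ages, predicted_ages, threshold):
    """CS(threshold): percentage of images with absolute error <= threshold years."""
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(true_ages, predicted_ages)
               if abs(p - t) <= threshold)
    return 100.0 * hits / len(true_ages)


# Made-up ages for illustration only; not data from FG-NET or MORPH.
true_ages = [23, 35, 41, 18, 60]
predicted_ages = [25, 34, 47, 18, 52]

# The threshold of 5 years is an assumed example value.
print(mean_absolute_error(true_ages, predicted_ages))
print(cumulative_score(true_ages, predicted_ages, threshold=5))
```

A lower MAE and a higher CS both indicate better age estimates. CS depends on the chosen threshold, so scores are comparable only when the threshold is the same.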

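The abstract describes the attention step only in words: average pooling yields channel weight factors, the weights are normalized, and the current input is reweighted. The sketch below is one possible reading of that description, not the paper's actual formulation. The function name `channel_attention`, the element-wise sum of input and hidden state, and the use of softmax for normalization are all assumptions.

```python
import math


def channel_attention(x_t, h_prev):
    """Reweight the channels of the current input x_t using the previous hidden state.

    x_t and h_prev are lists of C feature maps, each a list of rows (H x W).
    Assumption: both tensors have the same shape.
    """
    # Step 1 (assumed): combine input and hidden state, then apply global
    # average pooling to get one scalar score per channel.
    scores = []
    for x_map, h_map in zip(x_t, h_prev):
        total = sum(xv + hv
                    for x_row, h_row in zip(x_map, h_map)
                    for xv, hv in zip(x_row, h_row))
        count = sum(len(row) for row in x_map)
        scores.append(total / count)

    # Step 2 (assumed): normalize the scores with a softmax so the channel
    # weights are positive and sum to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]

    # Step 3: scale each channel of the input by its weight to form the
    # new input state that would be passed to the ConvLSTM cell.
    new_x = [[[w * v for v in row] for row in x_map]
             for w, x_map in zip(weights, x_t)]
    return weights, new_x


# Toy 2-channel, 2x2 example with made-up values.
x_t = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
h_prev = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
weights, new_x = channel_attention(x_t, h_prev)
print(weights)
```

In this toy case the second channel has the larger pooled response and therefore receives the larger weight, which matches the stated goal of emphasizing the features that matter most for age estimation.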
     
