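Matrix factorization models separate positive feedback from negative feedback and unobserved feedback. For each user, Gorse randomly holds out one of the user's positive feedbacks as the test item. The model is expected to rank the test item ahead of the other unobserved items. Because ranking all items for every user during evaluation is too time-consuming, Gorse follows the common strategy^[1] of randomly sampling 100 items that the user has not interacted with and ranking the test item among those 100 items.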
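The split and sampling described above can be sketched roughly as follows. This is a minimal illustration, not Gorse's actual implementation: the function names `leave_one_out` and `sample_candidates`, the dictionary-of-sets data layout, and the seeded `random.Random` are all assumptions made for the example.

```python
import random


def leave_one_out(positives, rng):
    """Hold out one random positive item per user.

    positives: dict mapping user -> set of items with positive feedback.
    Returns (train, test): train maps user -> remaining positives,
    test maps user -> the single held-out test item.
    """
    train, test = {}, {}
    for user, items in positives.items():
        ordered = sorted(items)  # sort first so a seeded rng is reproducible
        held_out = rng.choice(ordered)
        test[user] = held_out
        train[user] = {item for item in ordered if item != held_out}
    return train, test


def sample_candidates(all_items, interacted, test_item, rng, n_negatives=100):
    """Sample items the user never interacted with, then add the test item.

    interacted: every item the user has feedback on (test item included),
    so that none of the sampled items overlaps with known interactions.
    """
    pool = sorted(set(all_items) - set(interacted))
    negatives = rng.sample(pool, min(n_negatives, len(pool)))
    return negatives + [test_item]


# Hypothetical toy data: 1000 items and two users.
rng = random.Random(0)
all_items = range(1000)
positives = {"alice": {1, 2, 3}, "bob": {10, 20}}
train, test = leave_one_out(positives, rng)
candidates = sample_candidates(all_items, positives["alice"], test["alice"], rng)
```

The model then only has to score the items in `candidates` for each user instead of the whole catalogue, which is what keeps the evaluation cheap.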

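Matrix factorization models are evaluated on their top 10 recommendations. Suppose the model recommends 10 items $\hat I^{(10)}_u$ to user $u$, and the test item is $i_u$. Then: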

$$
\text{NDCG@10}=\sum_{u \in U}\sum_{i=1}^{10}\frac{\mathbb{I}(i_u=\hat I^{(10)}_{u,i})}{\log_2(i+1)}
$$

$$
\text{Precision@10}=\sum_{u \in U}\sum_{i=1}^{10}\frac{\mathbb{I}(i_u=\hat I^{(10)}_{u,i})}{10|U|}
$$

$$
\text{Recall@10}=\sum_{u \in U}\sum_{i=1}^{10}\frac{\mathbb{I}(i_u=\hat I^{(10)}_{u,i})}{|I_u|}
$$

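where $\hat I^{(10)}_{u,i}$ is the $i$-th item in the top 10 recommendations for user $u$, and the indicator $\mathbb{I}(\cdot)$ equals 1 when that item is the test item and 0 otherwise. Gorse uses NDCG@10 as the primary metric for selecting the best model.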
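As a rough illustration of how these metrics behave with a single test item per user, the sketch below computes them from ranked lists. It is not Gorse's code: `evaluate_user` and `evaluate` are hypothetical helpers, `n_test_items` stands in for $|I_u|$ and is assumed to be the number of held-out items (1 under the split described earlier), and averaging all three metrics over users is an assumption, since the formulas above only spell out the $|U|$ factor for Precision@10.

```python
import math


def evaluate_user(top_k, test_item, n_test_items=1, k=10):
    """Return (ndcg, precision, recall) for one user's ranked list."""
    ndcg = 0.0
    hits = 0.0
    for position, item in enumerate(top_k[:k], start=1):
        if item == test_item:  # indicator: test item found at this position
            ndcg += 1.0 / math.log2(position + 1)
            hits += 1.0
    return ndcg, hits / k, hits / n_test_items


def evaluate(recommendations, test_items, k=10):
    """Average the per-user metrics over all users (an assumption)."""
    totals = [0.0, 0.0, 0.0]
    for user, test_item in test_items.items():
        scores = evaluate_user(recommendations.get(user, []), test_item, k=k)
        totals = [t + s for t, s in zip(totals, scores)]
    n_users = len(test_items)
    return tuple(t / n_users for t in totals)


# Hypothetical toy data: u1's test item sits at position 3, u2's is missing.
recommendations = {
    "u1": [11, 12, 7, 13, 14, 15, 16, 17, 18, 19],
    "u2": [21, 22, 23, 24, 25, 26, 27, 28, 29, 30],
}
test_items = {"u1": 7, "u2": 99}
ndcg, precision, recall = evaluate(recommendations, test_items)
```

With one test item per user, the per-user NDCG@10 reduces to $1/\log_2(p+1)$ when the test item appears at position $p \le 10$ and to 0 otherwise, so the metric rewards placing the test item near the top of the list rather than merely including it.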

[1] He, Xiangnan, et al. "Neural collaborative filtering." Proceedings of the 26th International Conference on World Wide Web. 2017.