LightFM produces NaN user/item embeddings

Time: 2016-12-05 04:42:06

Tags: python machine-learning matrix-factorization

I am trying to build cold-start recommendations with the LightFM library in Python. https://github.com/lyst/lightfm

This works for plain collaborative filtering, with no user or item features, i.e.:

import numpy as np
from lightfm import LightFM
interaction_matrix

<322139x42715 sparse matrix of type '<type 'numpy.float32'>'
    with 4571208 stored elements in COOrdinate format>

model = LightFM(no_components=50)
model.fit(interaction_matrix, epochs=1, num_threads=32)
predictions = model.predict(12, np.arange(250), num_threads=32)

This gives sensible predictions. However, when I add:

members_features, item_features

(<322139x2790 sparse matrix of type '<type 'numpy.float32'>'
    with 19840665 stored elements in Compressed Sparse Row format>,
 <42715x2790 sparse matrix of type '<type 'numpy.float32'>'
    with 355006 stored elements in Compressed Sparse Row format>)

model2 = LightFM(no_components=100, loss='warp', item_alpha=0.001, user_alpha=0.001)

model2.fit(interaction_matrix, user_features=members_features, item_features=item_features,
           sample_weight=None, verbose=True, epochs=2, num_threads=32)

I get NaNs for both the user and item embeddings.

model2.item_embeddings

array([[ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       ..., 
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan]], dtype=float32)
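
To quantify how much of an embedding matrix is affected, rather than eyeballing the printed array, a small check along these lines may help. This is only a sketch: `non_finite_fraction` is a hypothetical helper name, not part of LightFM, and the `example` array is a made-up stand-in for `model2.item_embeddings`.

```python
import numpy as np


def non_finite_fraction(array):
    """Return the fraction of entries in `array` that are NaN or +/-inf."""
    values = np.asarray(array, dtype=np.float64)
    if values.size == 0:
        return 0.0
    return float(np.count_nonzero(~np.isfinite(values))) / values.size


# Made-up stand-in for model2.item_embeddings: one healthy row, one NaN row.
example = np.array([[0.1, -0.2], [np.nan, np.nan]], dtype=np.float32)
print(non_finite_fraction(example))
```

With a fitted model, the same helper could be applied to `model2.item_embeddings` and `model2.user_embeddings`. A value of 1.0 would mean every entry is non-finite, as in the output above.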

1 answer:

Answer 0 (score: 0)

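You should try updating to LightFM 1.12 (via pip install lightfm==1.12). That release fixes a number of numerical instability issues that could lead to the results you are seeing.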
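
Before upgrading, it can be useful to confirm which version is actually installed in the environment the model runs in. The following is a minimal sketch using only the standard library (Python 3.8+). `installed_version` and `version_tuple` are hypothetical helper names, and the comparison assumes plain dotted numeric versions such as "1.12".

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn a dotted version such as '1.12' into a tuple of ints: (1, 12).

    Parsing stops at the first component that does not start with a digit.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


found = installed_version("lightfm")
if found is None:
    print("lightfm is not installed")
elif version_tuple(found) < version_tuple("1.12"):
    print("lightfm", found, "is older than 1.12")
else:
    print("lightfm", found, "is 1.12 or newer")
```

Comparing tuples of integers rather than raw strings matters here, because as strings "1.9" would sort after "1.12".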

If you are interested in the gory details, you can have a look at the GitHub issue.