Question

I have a testing_df organized like this:

# Use the arrays to create a dataframe
testing_df = pd.DataFrame(test_array, columns=['transaction_id', 'product_id'])

# Split the product_id's for the testing data
testing_df.set_index(['transaction_id'], inplace=True)
print(testing_df.head(n=5))

transaction_id  product_id
001             (P01,)
002             (P01, P02)
003             (P01, P02, P09)
004             (P01, P03)
005             (P01, P03, P05)

Then I ran some calculations on it and created a dictionary to store the results:

# Initialize a dictionary to store the matches
matches = {}

# Return the product combos values that are of the appropriate length and the strings match
for transaction_id, i in enumerate(testing_df['product_id']):
    recommendation = None
    recommended_count = 0

    for k, count in product_combos.items():
        k = list(k)
        if len(i) + 1 == len(k) and count >= recommended_count:
            for product in i:
                if product in k:
                    k.remove(product)
            if len(k) == 1:
                recommendation = k[0]
                recommended_count = count
    matches[transaction_id] = recommendation

print(matches)
[out] {0: 'P09', 1: 'P09', 2: 'P06', 3: 'P09', 4: 'P09', 5: 'P09'}

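The problem I'm running into is that the keys of the matches dictionary should be 001, 002, 003, 004, 005, and so on, corresponding to the index of testing_df, which runs from 001 to 100.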

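My second problem is that I want to fill in another dictionary (recommendations) whose keys are 001-100. I'd like to populate this dict with the values from matches by matching on the keys. Any help would be greatly appreciated, thanks!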

Answer 1

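When you use enumerate in the for loop, your transaction_id is just an integer counter (starting at 0), not the value from the index. When you use it as a dictionary key, it shows up as 1 rather than 001. If you really want to work around this, you have to convert it to a zero-padded string, so instead of

matches[transaction_id] = recommendation

do

matches[str(transaction_id + 1).zfill(3)] = recommendation

The + 1 is there because enumerate counts from 0 while your IDs start at 001; this only lines up if the IDs are consecutive.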
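To make the off-by-one behaviour of enumerate concrete, here is a minimal sketch in plain Python. The `ids` list is made up for illustration and stands in for the dataframe index; the approach assumes the IDs are consecutive and start at 001.

```python
# Hypothetical transaction IDs, stored as zero-padded strings as in the question.
ids = ["001", "002", "003"]

# By default enumerate yields 0, 1, 2, ...; zfill pads each number to 3 digits.
default_keys = [str(position).zfill(3) for position, _ in enumerate(ids)]

# Passing start=1 shifts the counter to 1, 2, 3, ... so the keys match the IDs.
shifted_keys = [str(position).zfill(3) for position, _ in enumerate(ids, start=1)]

print(default_keys)
print(shifted_keys)
```

If the IDs ever skip a number or are reordered, the counter no longer matches the index, so looping over the index itself is the more robust option.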

Or you can loop over your index directly, something like

for ind in df.index:
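As a rough sketch of the index-based loop, the matching logic from the question can be wrapped in a function that takes (transaction_id, products) pairs, so the real IDs become the dictionary keys. The helper names `recommend` and `build_matches` and the sample data are hypothetical, and the leftover check is a simplified equivalent of the question's remove loop that assumes no product repeats within a combo.

```python
def recommend(products, product_combos):
    """Return the one extra product from the most frequent combo that
    extends `products` by exactly one item, or None if there is none."""
    recommendation, best_count = None, 0
    for combo, count in product_combos.items():
        if len(combo) == len(products) + 1 and count >= best_count:
            # Products in the combo that the transaction does not already have.
            leftover = [p for p in combo if p not in products]
            if len(leftover) == 1:
                recommendation, best_count = leftover[0], count
    return recommendation


def build_matches(id_product_pairs, product_combos):
    """Key each recommendation by the real transaction ID, not a counter."""
    return {tid: recommend(products, product_combos)
            for tid, products in id_product_pairs}


# Hypothetical sample data standing in for testing_df and product_combos.
transactions = {"001": ("P01",), "002": ("P01", "P02")}
product_combos = {("P01", "P02"): 5, ("P01", "P03"): 2, ("P01", "P02", "P09"): 4}

matches = build_matches(transactions.items(), product_combos)
print(matches)
```

With the dataframe from the question, the pairs could come from `testing_df['product_id'].items()`, which yields (index, value) tuples for a pandas Series.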

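I'm not sure what you mean by your second question. To create a dictionary that has the transaction IDs as keys and no values yet (each value is None), do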

dict.fromkeys(list(df.index))

as pointed out here.
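One possible reading of the second question is: start from a dictionary with all 100 IDs as keys, then copy over the values from matches where the keys agree. A minimal sketch under that assumption follows; the `all_ids` list stands in for `list(df.index)` and the contents of `matches` are made up.

```python
# Stand-in for list(df.index): the IDs "001" through "100".
all_ids = [str(n).zfill(3) for n in range(1, 101)]

# dict.fromkeys gives every key the same default value, None here.
recommendations = dict.fromkeys(all_ids)

# Hypothetical matches, keyed by the same zero-padded IDs.
matches = {"001": "P09", "003": "P06"}

# Copy each match into recommendations only where the key exists.
for tid, product in matches.items():
    if tid in recommendations:
        recommendations[tid] = product

print(recommendations["001"], recommendations["002"], recommendations["003"])
```

IDs without a match keep None, which makes it easy to see which transactions got no recommendation.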

Problem matching dictionary keys to a dataframe's index
