I have a DataFrame, testing_df, organized like this:
import pandas as pd

# Use the arrays to create a dataframe
testing_df = pd.DataFrame(test_array, columns=['transaction_id', 'product_id'])
# Use transaction_id as the index of the testing data
testing_df.set_index(['transaction_id'], inplace=True)
print(testing_df.head(n=5))
transaction_id product_id
001 (P01,)
002 (P01, P02)
003 (P01, P02, P09)
004 (P01, P03)
005 (P01, P03, P05)
I then ran some calculations on it and created a dictionary to store the results:
# Initialize a dictionary to store the matches
matches = {}
# Return the product combos values that are of the appropriate length and the strings match
for transaction_id, i in enumerate(testing_df['product_id']):
    recommendation = None
    recommended_count = 0
    for k, count in product_combos.items():
        k = list(k)
        if len(i) + 1 == len(k) and count >= recommended_count:
            for product in i:
                if product in k:
                    k.remove(product)
            if len(k) == 1:
                recommendation = k[0]
                recommended_count = count
    matches[transaction_id] = recommendation
print(matches)
[out]
{0: 'P09', 1: 'P09', 2: 'P06', 3: 'P09', 4: 'P09', 5: 'P09'}
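The problem I'm having is that the keys of the matches dictionary should be 001, 002, 003, 004, 005 and so on, corresponding to the index of testing_df, which runs from 001 to 100.
My second question is that I want to fill in another dictionary (recommendations) whose keys are 001-100. I would like to populate this dict with the values from matches by matching on the keys. Any help would be greatly appreciated, thanks!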
Answer 0 (score: 2)
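When you use enumerate in the for loop, your transaction_id is just an integer counter that starts at 0. When you use it as a dictionary key, it shows up as 1 rather than 001. If you really want to fix it this way, you have to convert it to a zero-padded string, so instead of
matches[transaction_id] = recommendation
do
matches[str(transaction_id + 1).zfill(3)] = recommendation
The + 1 is there because enumerate counts from 0 while your IDs start at 001; this only works if the IDs are consecutive.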
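A quick check of the zero-padding, as a minimal sketch. It assumes the transaction IDs are consecutive strings starting at '001', which the question suggests but does not confirm, and `baskets` is a made-up stand-in for `testing_df['product_id']`.

```python
# Hypothetical stand-in for testing_df['product_id'] (first three rows).
baskets = [("P01",), ("P01", "P02"), ("P01", "P02", "P09")]

# enumerate() counts from 0 by default; start=1 makes the counter line up
# with IDs that begin at '001' (an assumption about the data).
keys = [str(n).zfill(3) for n, _ in enumerate(baskets, start=1)]
print(keys)  # ['001', '002', '003']
```

Passing `start=1` to enumerate has the same effect as adding 1 to the counter before padding.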
Or you could loop over your index directly, with something like
for ind in df.index:
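Looping over the index avoids rebuilding the IDs altogether. The sketch below uses `Series.items()`, which yields (index label, value) pairs, so the dictionary keys are whatever the index actually holds. The sample rows and the `len(products)` placeholder are illustrative only and are not the recommendation logic from the question.

```python
import pandas as pd

# Small sample shaped like the question's testing_df (made-up subset).
testing_df = pd.DataFrame(
    {
        "transaction_id": ["001", "002", "003"],
        "product_id": [("P01",), ("P01", "P02"), ("P01", "P02", "P09")],
    }
).set_index("transaction_id")

matches = {}
# Series.items() yields (index label, value) pairs, so transaction_id is
# the real label ('001', ...) rather than a positional counter.
for transaction_id, products in testing_df["product_id"].items():
    # Placeholder result; the question's recommendation logic would go here.
    matches[transaction_id] = len(products)

print(matches)  # {'001': 1, '002': 2, '003': 3}
```

This also keeps working if the IDs are not consecutive, since nothing is derived from the row position.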
I'm not sure what you mean by your second question. To create an empty dictionary keyed by the transaction IDs, use
dict.fromkeys(list(df.index))
as pointed out here.
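If the second question means "start from a dictionary with every transaction ID as a key and copy in whatever matches holds", one possible reading is sketched below. Here `all_ids` stands in for `list(df.index)`, and the sample `matches` values are made up for illustration.

```python
# Stand-in for list(df.index); the question mentions IDs 001-100.
all_ids = [str(n).zfill(3) for n in range(1, 101)]

# Hypothetical subset of results keyed by transaction ID.
matches = {"001": "P09", "002": "P09", "003": "P06"}

# Every ID starts out mapped to None...
recommendations = dict.fromkeys(all_ids)
# ...then IDs that have a match get its value; the rest stay None.
for transaction_id in recommendations:
    if transaction_id in matches:
        recommendations[transaction_id] = matches[transaction_id]

print(recommendations["003"], recommendations["100"])  # P06 None
```

This only lines up if `matches` is keyed by the same strings as the index, which is why the key fix above matters.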