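I am trying to build a Markov chain user transition matrix from scratch, but I am stuck on assigning values in a dictionary. Here is the sample code: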
## user purchase sequences, separated by '|', at different time intervals
## let's say in the first purchase the user bought products 3 4 12 23 45 41 25, then 4 5 12 17 19 25 46 3, and so on
user_purchase = '3 4 12 23 45 41 25|4 5 12 17 19 25 46 3|39 12 3 23 50 24 35 13|42 34 17 19 46'
## I need to find the transition counts from the first purchase to the second, and so on
## e.g. 3-1 is 0, 3-2 is 0, 3-3 is 1, 3-4 is 1
## hence the output should be {..., 2:[(0,0),(0,0),.....], 3:[(0,1),(0,1),(1,1),(1,1), ...], 4:[...]}, i.e. a dictionary of lists of tuples
### let's say the products the user can buy range from 1 to 50
prod = range(1,51)
### initializing a dictionary of lists of tuples
t = (0,0)
list1 = []
for _ in range(len(prod)):
    list1.append(t)
user_tran = {}
for p in prod:
    user_tran[p] = list1
# def trans_matrix(prod_seq):
basket_seq = user_purchase.split('|')
iteration = len(basket_seq)
for j in range(iteration-1):
    trans_from = basket_seq[j]
    trans_to = basket_seq[j+1]
    tfrom = list(map(int, trans_from.split(' ')))
    print(tfrom)
    tto = list(map(int, trans_to.split(' ')))
    for item in tfrom:
        ### problem: in each iteration the value for ALL keys changes from [(0,0),(0,0),....] to item_list
        item_list = user_tran[item]  ### the problem seems to be here
        for i in range(len(prod)):
            if i+1 in tto:
                temp = item_list[i]
                x = list(temp)
                x[0] = x[0] + 1
                x[1] = x[1] + 1
                item_list[i] = tuple(x)
            else:
                temp = item_list[i]
                x = list(temp)
                x[0] = x[0]
                x[1] = x[1] + 1
                item_list[i] = tuple(x)
        user_tran[item] = item_list  ### only the list for the key `item` should be updated, but all keys end up with the same value
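Expected output

At the moment user_tran[3][1:5] gives:
Out[38]: [(0, 23), (15, 23), (7, 23), (7, 23)]
Product 3 is present in the first three purchase sequences, and across those three sequences there are 0 transitions from 3 to 1 or from 3 to 2, but two transitions from 3 to 3. The list for key 3 should therefore be:
[(0,3),(0,3),(2,3),...... up to product 50]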
Answer 0 (score: 0)
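I did not find the cause, but I tried implementing it with a numpy array instead, without any tuples or dictionaries.
My output differs from the output shown in your question, but I followed exactly what you were trying to do with the dictionary. It is simply a translation of the dictionary-of-lists version into a numpy array version. Perhaps it will help you.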
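For what it is worth, the symptom in the question (every key changing at once) is consistent with all keys sharing a single list object, because user_tran[p] = list1 stores a reference to list1 rather than a copy. The following is a minimal sketch of that behavior and of one possible fix; the names n_products, shared, aliased and independent are illustrative and do not come from the original code.

```python
# Illustrative names only; a reduced version of the setup in the question.
n_products = 5

# Every key refers to the SAME list object, as with `user_tran[p] = list1`.
shared = [(0, 0)] * n_products
aliased = {p: shared for p in range(1, n_products + 1)}
aliased[3][0] = (1, 1)            # intended to change key 3 only
print(aliased[1][0])              # (1, 1): key 1 changed as well
print(aliased[1] is aliased[3])   # True: one list sits behind every key

# One possible fix: build a fresh list for each key.
independent = {p: [(0, 0)] * n_products for p in range(1, n_products + 1)}
independent[3][0] = (1, 1)
print(independent[1][0])          # (0, 0): key 1 is untouched
```

The NumPy translation that follows sidesteps the issue, because each row of the array is separate storage.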
import numpy as np

user_purchase = '3 4 12 23 45 41 25|4 5 12 17 19 25 46 3|39 12 3 23 50 24 35 13|42 34 17 19 46'
prod = range(0, 50)
# user_tran[a, b] holds the pair (transitions a -> b, times a was a source), with 0-based product indices
user_tran = np.zeros((50,50,2))
basket_seq = user_purchase.split('|')
iteration = len(basket_seq)
for j in range(iteration-1):
    trans_from = basket_seq[j]
    trans_to = basket_seq[j+1]
    # shift product ids (1..50) to array indices (0..49)
    tfrom = [int(x) - 1 for x in trans_from.split(' ')]
    tto = [int(x) - 1 for x in trans_to.split(' ')]
    for item in tfrom:
        item_list = user_tran[item, :, :]
        for i in range(len(prod)):
            if i in tto:  # tto is already 0-based, so compare with i, not i + 1
                temp = item_list[i, :]
                item_list[i, :] = np.array([temp[0] + 1, temp[1] + 1])
            else:
                temp = item_list[i, :]
                # keep the transition count, increment only the second counter
                item_list[i, :] = np.array([temp[0], temp[1] + 1])
        user_tran[item, :, :] = item_list
print(user_tran[2, 1:5, :])
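user_tran has the shape NxMx2, where N is the number of keys in the dictionary version, M is the number of items in the store, and the final 2 takes the place of the 2-value tuple. For example, to get the 3rd key of the dictionary and list positions 1 through 4, you have to write
user_tran[2, 1:5, :] #instead of user_tran[3][1:5]
because array indices start at 0, not 1.
You will get a 4x2 matrix, where 4 is the number of list elements and 2 is the two values of each tuple.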
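The shape and indexing described above can be checked with a condensed sketch. It assumes each basket lists a product at most once, and it replaces the inner Python loops with NumPy fancy indexing rather than reproducing the answer line by line. The function name transition_counts and the names hits, trials, block and as_tuples are illustrative, not part of the original answer.

```python
import numpy as np

def transition_counts(user_purchase, n_products=50):
    """Return an (n_products, n_products, 2) integer array of count pairs.

    counts[a, b, 0]: consecutive basket pairs where product a+1 is in the
                     earlier basket and product b+1 is in the later one.
    counts[a, b, 1]: consecutive basket pairs where product a+1 is in the
                     earlier basket.
    Assumes a product appears at most once per basket.
    """
    # Parse baskets and shift product ids (1..n) to array indices (0..n-1).
    baskets = [[int(x) - 1 for x in b.split()] for b in user_purchase.split('|')]
    counts = np.zeros((n_products, n_products, 2), dtype=int)
    hits = counts[:, :, 0]     # view: writes go through to counts
    trials = counts[:, :, 1]   # view: writes go through to counts
    for tfrom, tto in zip(baskets, baskets[1:]):
        hits[np.ix_(tfrom, tto)] += 1   # every (from, to) combination
        trials[tfrom, :] += 1           # every row of a source product
    return counts

user_purchase = ('3 4 12 23 45 41 25|4 5 12 17 19 25 46 3|'
                 '39 12 3 23 50 24 35 13|42 34 17 19 46')
counts = transition_counts(user_purchase)

# Row 2 is product 3; columns 1..4 are products 2..5.
block = counts[2, 1:5, :]
print(block.shape)                                   # (4, 2)
as_tuples = [tuple(pair) for pair in block.tolist()]
print(as_tuples)                                     # [(0, 3), (2, 3), (1, 3), (1, 3)]
```

Converting a slice back with tolist() recovers the list-of-tuples form used in the dictionary version, which makes the two representations easy to compare.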