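I have a list of actions described by strings, e.g. "BPxPyPzPC", where each letter stands for an action and C stands for an event.
Some users' actions (e.g. "B", "Px", "Py" and "Pz") led to an event (in my example, the letter "C") and others did not, so I want to determine which pattern of actions (e.g. "BPxPyPz") most often leads to the event. What is the most efficient way to do this in Python?
Thanks!
Sample code: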
c=['' for x in range(0,4)]
c[0]="BPxPxPyPC"
c[1]="BPxPyPyPC"
c[2]="BPyPxPyPC"
c[3]="BPyPxPyPC"
#do something
#desired result
# The most likely sequence of actions to achieve "C" is "BPyPxPy"
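Answer 0 (score: 3)
It is not clear whether, and how, you distinguish the individual actions.
I use a regular expression to match everything that precedes a trailing C, and a Counter to find the most common such string.
This is the simplest way to get the result: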
import re
from collections import Counter
c = ["BPxPxPyPC", "BPxPyPyPC", "BPyPxPyPC", "BPyPxPyPC"]
cnt = Counter()
for sequence in c:
    # Capture everything before a trailing "C"; sequences without the event are skipped
    m = re.match(r'^(.*)C$', sequence)
    if m:
        cnt[m.group(1)] += 1
print('The most likely sequence is "{}"'.format(cnt.most_common(1)[0][0]))
# The most likely sequence is "BPyPxPyP"
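Since the answer notes that it is unclear how actions are told apart, here is a minimal sketch of one possible reading. It assumes (this is not stated in the question) that each action is an uppercase letter optionally followed by lowercase letters, so "BPxPyPC" splits into B, Px, Py, P, C. The names TOKEN and most_common_pattern are hypothetical helpers introduced for illustration only.

```python
import re
from collections import Counter

# Assumption: an action is one uppercase letter plus optional lowercase letters
TOKEN = re.compile(r"[A-Z][a-z]*")

def most_common_pattern(sequences, event="C"):
    """Return (pattern, hits) for the action pattern that most often precedes event."""
    counts = Counter()
    for seq in sequences:
        tokens = TOKEN.findall(seq)
        # Count only sequences whose final token is the event
        if tokens and tokens[-1] == event:
            counts[tuple(tokens[:-1])] += 1
    if not counts:
        return None
    return counts.most_common(1)[0]

sequences = ["BPxPxPyPC", "BPxPyPyPC", "BPyPxPyPC", "BPyPxPyPC", "BPxPy"]
print(most_common_pattern(sequences))
```

Storing each pattern as a tuple of tokens keeps the individual actions available for later analysis, for example counting how often a single action such as Py appears in successful sequences. The last input, "BPxPy", does not end in the event and is ignored.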
Answer 1 (score: 0)
I would do something like this (it only considers the first action sequence in the stack with the highest hit count for the event ev):
def checkSeq(c, stack):
    # stack is a list of [sequence, hit count] pairs
    stackSeqs = [x[0] for x in stack]
    if c not in stackSeqs:
        # Unseen sequence: register it with a count of 0
        stack.append([c,0])
    else:
        # Known sequence: increment its hit count
        idx = stackSeqs.index(c)
        stack[idx][1] += 1
    return stack
def max_act_ev(ev, stack):
    # Collect the rows whose sequence ends with the event
    acts=[]
    for row in stack:
        if ev in row[0][-1]:
            acts.append(row)
    if len(acts) > 0:
        # Stable sort by hit count, so the first row wins on ties
        res = sorted(acts, key=lambda x: x[1],reverse=True)
        return res[0]
    else:
        return []
# Start of code
stack=[["BPyPxPyPC",1],["BPxPxPyPC",1],["BPxPxPxPC",1]]
c = "BPxPxPyPC"
ev = "C"
stack = checkSeq(c,stack)
seq = max_act_ev(ev,stack)
print(stack)
if len(seq)>0:
    print('The most likely sequence of actions to achieve "'+seq[0][-1]+'" is "'+seq[0][:-1]+'"')
else:
    print("No actions found for event "+ev)
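The approach above reports only the first sequence with the highest hit count. As a rough sketch of a variant that reports every tied sequence, the bookkeeping can be handed to collections.Counter. The function name top_sequences and the sample list observed are hypothetical, introduced here for illustration, and the sketch assumes raw observed sequences as input rather than a pre-counted stack.

```python
from collections import Counter

def top_sequences(sequences, ev="C"):
    """Return (list of action sequences tied for most hits, hit count)."""
    # Keep only sequences that end with the event, and strip the event itself
    counts = Counter(s[:-len(ev)] for s in sequences if s.endswith(ev))
    if not counts:
        return [], 0
    best = max(counts.values())
    return [s for s, n in counts.items() if n == best], best

observed = ["BPyPxPyPC", "BPxPxPyPC", "BPxPxPxPC",
            "BPxPxPyPC", "BPyPxPyPC", "BPx"]
winners, hits = top_sequences(observed)
print(winners, hits)
```

Here two sequences are tied with two hits each, so both are returned; "BPx" never reaches the event and is left out of the count.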