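Performing (string/list) prediction in Python

Time: 2013-10-28 11:10:31

Tags: list python-2.7 statistics prediction

I need to perform some kind of prediction/suggestion using Python.

For example, suppose we have several lists: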

alphabet = ["a", "b", "c"]

other_alphabet = ["a", "b", "d"]

another_alphabet = ["a", "b", "c"]

and an alphabet that is currently being built:

current_alphabet = ["a", "b", ...]

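In 2 of the 3 alphabets, the letter that follows "a", "b" is "c", so I would like the code to predict/suggest that the next letter after "a", "b" in current_alphabet is "c" (with a probability of about 66%).

I think this task is a bit more complicated than it looks.

Any suggestions on how to achieve this? Perhaps something similar already exists that could help with the process?

1 answer:

Answer 0 (score: 1)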

import itertools

alphabet = ["a", "b", "c"]
other_alphabet = ["a", "b", "d"]
another_alphabet = ["a", "b", "c"]

# Take the letter at the current position from each known alphabet; these
# letters are the candidates for the prediction.
# zip() regroups the lists by position: all first elements end up together,
# all second elements together, and so on.
# You have to track the current position yourself as the user enters data,
# i.e. know whether the first, second or tenth letter is being entered.
# The position is a zero-based index: after "a", "b" the next letter is at index 2.
current_position = 2

letters = zip(alphabet,other_alphabet, another_alphabet)[current_position]
letters = list(letters)
letters.sort()
print 'letters at current position', letters

# group all occurrences of the same letter (the list must be sorted first)
letter_groups = itertools.groupby(letters, key=lambda x: x[0])

# count the occurrences of each letter across the alphabets and divide
# by the total number of letters to get a probability

letter_probabilities = [[letter, sum(1 for _ in group) / float(len(letters))] for letter, group in letter_groups]
print 'letter probabilities at the current position ', letter_probabilities

The code above produces the following output:

letters at current position ['c', 'c', 'd']
letter probabilities at the current position  [['c', 0.6666666666666666], ['d', 0.3333333333333333]]
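
Note that the approach above looks only at the position and ignores which letters were actually entered. If the suggestion should depend on the letters typed so far, as the question describes, one possible variant is to count only the lists that start with the current prefix. The following is a minimal sketch under that assumption, not part of the original answer; the helper name predict_next and its parameters are hypothetical, and it is written to run on both Python 2.7 and Python 3.

```python
from collections import Counter


def predict_next(known_lists, current):
    """Return (item, probability) pairs for the item that follows `current`.

    Hypothetical helper: only lists that start with `current` and contain
    at least one more item are counted.
    """
    n = len(current)
    followers = Counter(
        known[n]
        for known in known_lists
        if len(known) > n and known[:n] == current
    )
    total = sum(followers.values())
    if total == 0:
        return []  # no known list matches this prefix
    # most_common() orders the candidates from most to least frequent
    return [(item, count / float(total))
            for item, count in followers.most_common()]


alphabet = ["a", "b", "c"]
other_alphabet = ["a", "b", "d"]
another_alphabet = ["a", "b", "c"]

suggestions = predict_next(
    [alphabet, other_alphabet, another_alphabet], ["a", "b"]
)
```

For the three example lists, suggestions holds "c" first with a probability of 2/3, followed by "d" with 1/3. A prefix that matches none of the known lists yields an empty list, so the caller can decide what to suggest in that case.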