Question

I have a dataset of sequence tuples and their targets, as shown below:

   input_0  input_1  input_2  output
0        0      1.0      2.0     4.0
1        1      2.0      4.0     2.0
2        2      4.0      2.0     4.0
3        4      2.0      4.0     7.0
4        2      4.0      7.0     8.0

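I train an algorithm using output as the target value.

What I want is to get the two most likely values that can follow a given tuple.

For example, if I have two tuples for training, a,b,c,d and a,b,c,e, I would like to get d and e as the result, each with its corresponding percentage.

Is this possible?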

Answer 1

From your comment, this appears to be a pandas.DataFrame. Suppose you start with

import pandas as pd
from collections import Counter

df = pd.DataFrame({
    'input_0': [1, 1, 2, 4, 2],
    'input_1': [1, 1, 2, 4, 4],
    'input_2': [2, 2, 2, 4, 7],
    'output': [4, 3, 4, 7, 8]})
>>> df
   input_0  input_1  input_2  output
0        1        1        2       4
1        1        1        2       3
2        2        2        2       4
3        4        4        4       7
4        2        4        7       8

Then the following shows, for each input tuple, the two most common output values along with their counts:

>>> df.output.groupby([df.input_0, df.input_1, df.input_2]).apply(lambda s: Counter(s).most_common(2)).reset_index()
   input_0  input_1  input_2            output
0        1        1        2  [(3, 1), (4, 1)]
1        2        2        2          [(4, 1)]
2        2        4        7          [(8, 1)]
3        4        4        4          [(7, 1)]
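
The question asks for percentages rather than raw counts. One possible way to get them, as a minimal sketch, is to replace Counter with Series.value_counts(normalize=True), which returns relative frequencies sorted in descending order. The names keys and top_two_shares are hypothetical helpers introduced here for illustration; they are not part of the original answer.

```python
import pandas as pd

# Same example frame as in the answer above.
df = pd.DataFrame({
    'input_0': [1, 1, 2, 4, 2],
    'input_1': [1, 1, 2, 4, 4],
    'input_2': [2, 2, 2, 4, 7],
    'output': [4, 3, 4, 7, 8]})

# Hypothetical name: the columns that together form the input tuple.
keys = ['input_0', 'input_1', 'input_2']

def top_two_shares(s):
    """Return up to two (value, share) pairs for one group of outputs."""
    # normalize=True turns counts into fractions of the group size.
    shares = s.value_counts(normalize=True).head(2)
    return [(value, float(share)) for value, share in shares.items()]

result = df.groupby(keys)['output'].apply(top_two_shares).reset_index()
print(result)
```

Each share is a fraction between 0 and 1; multiply by 100 for a percentage. Note that these are empirical frequencies observed in the training data, not probabilities produced by a trained model. Tuples that occur only once will always report a single value with share 1.0.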
