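Printing a 2D matrix from a list based on string matching

Time: 2018-05-10 13:48:31

Tags: python python-3.x pandas machine-learning

I have a list that I want to express as a grid, based on each of the selected features.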

breakfast = [['Apple,Banana'],['Apple,Yogurt'],['Banana,Oatmeal']]

Desired grid:

Index:   Apple   Banana   Yogurt   Oatmeal
1         "x"      "x"     " "       " "
2         "x"      " "     "x"       " "
3         " "      "x"     " "       "x"

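I think I need regular expressions and string indexing over the list to build the grid; how to do that is my question. Better still, is there a Python library that does this automatically (like leaps/summary in R)?

This is my current code: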

def printMatrix(data):
    header = "Index:\tApple\tBanana\tYogurt\tOatmeal"
    print(header)
    for index, value in enumerate(data):
        # Each check searches the string form of the row for one item name.
        if str(value).find('Apple') != -1:
            print(index, "\t'X'", end='')
        else:
            print(index, "\t' '", end='')
        if str(value).find('Banana') != -1:
            print("\t'X'", end='')
        else:
            print("\t' '", end='')
        if str(value).find('Yogurt') != -1:
            print("\t'X'", end='')
        else:
            print("\t' '", end='')  # stay on the same row
        # The last column ends the row, so both branches print the newline.
        if str(value).find('Oatmeal') != -1:
            print("\t'X'")
        else:
            print("\t' '")

The result is accurate, but the approach is inefficient.

1 answer:

Answer 0: (score: 2)

You can use a pure pandas solution: first create a Series, then select the first value of each list with str[0], and finally call str.get_dummies.
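As a minimal sketch of the str[0] variant described here, assuming the single-string-per-row input from the question (the variable name df is only illustrative):

```python
import pandas as pd

# Input shape from the question: each inner list holds one comma-separated string.
breakfast = [['Apple,Banana'], ['Apple,Yogurt'], ['Banana,Oatmeal']]

# .str[0] picks the first element of each inner list;
# .str.get_dummies(',') splits on ',' and builds one 0/1 column per item.
df = pd.Series(breakfast).str[0].str.get_dummies(',')
print(df)
```

The columns come out in sorted order (Apple, Banana, Oatmeal, Yogurt), not in the order of the desired grid.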

However, if a list can hold multiple values, first join them with a list comprehension and then call str.get_dummies:

import pandas as pd

breakfast = [['Apple,Banana', 'Apple,Yogurt'],['Apple,Yogurt'],['Banana,Oatmeal']]

df = pd.Series([','.join(x) for x in breakfast]).str.get_dummies(',')
print (df)
   Apple  Banana  Oatmeal  Yogurt
0      1       1        0       1
1      1       0        0       1
2      0       1        1       0
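The 0/1 frame from str.get_dummies can be turned into the "x" grid the question asks for. The following is only a sketch: the column order is copied from the desired grid, and the names dummies, columns, and grid are illustrative rather than part of the original answer.

```python
import pandas as pd

breakfast = [['Apple,Banana'], ['Apple,Yogurt'], ['Banana,Oatmeal']]
dummies = pd.Series([','.join(x) for x in breakfast]).str.get_dummies(',')

# Column order taken from the desired grid in the question (an assumption).
columns = ['Apple', 'Banana', 'Yogurt', 'Oatmeal']

# reindex restores that order and fills any item that never occurs with 0.
grid = dummies.reindex(columns=columns, fill_value=0)

# Map 1/0 to 'x'/' ' column by column.
grid = grid.astype(bool).apply(lambda col: col.map({True: 'x', False: ' '}))

# Number the rows from 1, as in the desired grid.
grid.index = range(1, len(grid) + 1)
grid.index.name = 'Index'
print(grid)
```

Using reindex instead of plain column selection keeps the sketch from failing when one of the listed items is missing from the data.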
