Pandas: compare the list items in one column to a single value in another column

Time: 2020-07-23 15:14:56

Tags: python pandas apply

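Consider this two-column df. I would like to create an apply function that compares each item in the list in the "other_yrs" column to the single integer in the "cur" column, and keeps a count of the items in the "other_yrs" list that are less than or equal to the value in "cur". I cannot figure out how to get pandas to do this with apply. I use apply functions for other purposes and they work fine. Any ideas would be much appreciated.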

    cur other_yrs
1   11  [11, 11]
2   12  [16, 13, 12, 9, 9, 6, 6, 3, 3, 3, 2, 1, 0]
4   16  [15, 85]
5   17  [17, 17, 16]
6   13  [8, 8]

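Below is the function I use to extract the values into the "other_yrs" column. I am thinking I could somehow compare each successive value in the list to the "cur" column value and keep a count. I really only need to store the count of how many list items are <= the value in the "cur" column.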

def col_check(col_string):
    cs_yr_lst = []
    count = 0
    if len(col_string) < 1:  # an empty string means there are no other cases
        pass
    else:
        case_lst = col_string.split(", ")  # split the string of cases into a list
        for i in case_lst:
            cs_yr = int(i[3:5])  # get the case year from each individual case number
            cs_yr_lst.append(cs_yr)  # collect the years; apply stores this list in a new column
    return cs_yr_lst
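For context, here is a minimal sketch of how a function like this might be used with apply to build the "other_yrs" column. The case-number format ("CR-11-001") and the column name "cases" are hypothetical, chosen only so that characters 3 and 4 hold a two-digit year; the real data is not shown in the question.

```python
import pandas as pd


def col_check(col_string):
    """Return the two-digit years found at positions 3-4 of each case number."""
    if len(col_string) < 1:  # empty string: no other cases
        return []
    return [int(case[3:5]) for case in col_string.split(", ")]


# Hypothetical raw data: comma-separated case numbers in an assumed format.
raw = pd.DataFrame({"cases": ["CR-11-001, CR-11-002", "", "CR-15-010, CR-85-003"]})

# Series.apply calls col_check once per cell and stores the returned list.
raw["other_yrs"] = raw["cases"].apply(col_check)
print(raw)
```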

The expected output is:

   cur                                   other_yrs  count
1   11                                    [11, 11]      2
2   12  [16, 13, 12, 9, 9, 6, 6, 3, 3, 3, 2, 1, 0]     11
4   16                                    [15, 85]      1
5   17                                [17, 17, 16]      3
6   13                                      [8, 8]      2

2 answers:

Answer 0 (score: 3)

Use zip inside a list comprehension to pair up the cur and other_yrs columns, then use np.sum on the boolean mask:

df['count'] = [np.sum(np.array(b) <= a) for a, b in zip(df['cur'], df['other_yrs'])]

Another idea:

df['count'] = pd.DataFrame(df['other_yrs'].tolist(), index=df.index).le(df['cur'], axis=0).sum(1)

Result:

   cur                                   other_yrs  count
1   11                                    [11, 11]      2
2   12  [16, 13, 12, 9, 9, 6, 6, 3, 3, 3, 2, 1, 0]     11
4   16                                    [15, 85]      1
5   17                                [17, 17, 16]      3
6   13                                      [8, 8]      2
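Since the question asks specifically about apply, the same count can probably also be obtained with a row-wise DataFrame.apply. The sketch below rebuilds the sample frame from the question so that it runs on its own. It is likely to be slower than the zip version on large frames, because apply with axis=1 builds a Series for every row.

```python
import pandas as pd

# Sample data copied from the question, including its non-contiguous index.
df = pd.DataFrame(
    {
        "cur": [11, 12, 16, 17, 13],
        "other_yrs": [
            [11, 11],
            [16, 13, 12, 9, 9, 6, 6, 3, 3, 3, 2, 1, 0],
            [15, 85],
            [17, 17, 16],
            [8, 8],
        ],
    },
    index=[1, 2, 4, 5, 6],
)

# axis=1 passes each row to the function; count the list items <= that row's cur.
df["count"] = df.apply(
    lambda row: sum(yr <= row["cur"] for yr in row["other_yrs"]),
    axis=1,
)
print(df)
```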

Answer 1 (score: 2)

You could consider explode, then compare, then group on level=0 and sum:

u = df.explode('other_yrs')
df['Count'] = u['cur'].ge(u['other_yrs']).groupby(level=0).sum().astype(int)

print(df)
   cur                                   other_yrs  Count
1   11                                    [11, 11]      2
2   12  [16, 13, 12, 9, 9, 6, 6, 3, 3, 3, 2, 1, 0]     11
4   16                                    [15, 85]      1
5   17                                [17, 17, 16]      3
6   13                                      [8, 8]      2
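To make the intermediate steps visible, here is a self-contained sketch of the explode approach on the question's sample data. The names u, mask and counts are only illustrative. It assumes a pandas version that has DataFrame.explode (0.25 or later) and uses groupby(level=0) to sum per original row.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "cur": [11, 12, 16, 17, 13],
        "other_yrs": [
            [11, 11],
            [16, 13, 12, 9, 9, 6, 6, 3, 3, 3, 2, 1, 0],
            [15, 85],
            [17, 17, 16],
            [8, 8],
        ],
    },
    index=[1, 2, 4, 5, 6],
)

# explode gives one row per list item and repeats the original index label.
u = df.explode("other_yrs")

# True where cur >= item, which is the same test as item <= cur.
mask = u["cur"].ge(u["other_yrs"])

# Summing the booleans per original index label yields the per-row count.
counts = mask.groupby(level=0).sum().astype(int)

# Assignment aligns on the index, so each count lands on its own row.
df["Count"] = counts
print(df)
```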