Question

I want to match two string columns and count the number of (exact) matches. For example, given two columns such as

index   col1   col2
0       aa     ji
1       bs     aa
2       qe     bs
3       gd     aa

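col1 consists of unique IDs. I want to count how many times each element of col1 appears in col2. In other words, for the example above I want the following output:

col3
2
1
0
0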

I tried to do this with pandas str.contains() and a for loop, but given the large number of observations it seems too slow and inefficient. My code is shown below.

num = []
for i in range(len(col1)):
    # str.contains matches substrings (as a regex by default), not whole values
    count = col2.str.contains(col1[i]).sum()
    num.append(count)

Is there a more time-efficient way to do this?

Answer 1

Use map and value_counts:

df['col3'] = df['col1'].map(df['col2'].value_counts()).fillna(0)

Output:

   index col1 col2  col3
0      0   aa   ji   2.0
1      1   bs   aa   1.0
2      2   qe   bs   0.0
3      3   gd   aa   0.0
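
For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the example frame from the question. The trailing astype(int) is an addition that is not part of the answer above; it only converts the float counts left behind by fillna into integers, and it can be dropped if floats are acceptable.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "col1": ["aa", "bs", "qe", "gd"],
    "col2": ["ji", "aa", "bs", "aa"],
})

# Count each distinct value of col2 once, then look up every col1 value.
counts = df["col2"].value_counts()

# IDs that never appear in col2 map to NaN, so fill with 0.
# astype(int) is an optional addition to get integer counts.
df["col3"] = df["col1"].map(counts).fillna(0).astype(int)

print(df["col3"].tolist())  # [2, 1, 0, 0]
```

Because value_counts runs once over col2 and map is a per-value lookup, this avoids rescanning col2 for every row, which is what made the loop in the question slow.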

Answer 2

Try:

df['counts'] = df.col1.apply(lambda x: list(df.col2.values).count(x))
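
Note that the one-liner above rebuilds the list and scans all of col2 for every row, so it may still be slow on large data. A possible variation, offered only as a sketch and not as the poster's method, is to tally col2 once with collections.Counter from the standard library and then look up each ID:

```python
from collections import Counter

import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "col1": ["aa", "bs", "qe", "gd"],
    "col2": ["ji", "aa", "bs", "aa"],
})

# Tally every value of col2 in a single pass.
col2_counts = Counter(df["col2"])

# A Counter returns 0 for keys it has never seen, so no fillna is needed.
df["counts"] = df["col1"].map(lambda x: col2_counts[x])

print(df["counts"].tolist())  # [2, 1, 0, 0]
```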