Question

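I want to count the occurrences of a list of substrings in a column that contains long strings, and store the result in a new count column of a pandas DataFrame.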

Input:

    ID    History
    1     USA|UK|IND|DEN|MAL|SWE|AUS
    2     USA|UK|PAK|NOR
    3     NOR|NZE
    4     IND|PAK|NOR

    lst = ['USA', 'IND', 'DEN']

Output:

    ID    History                       Count
    1     USA|UK|IND|DEN|MAL|SWE|AUS    3
    2     USA|UK|PAK|NOR                1
    3     NOR|NZE                       0
    4     IND|PAK|NOR                   1

Answer 1

Here is one way, using str.count:

df1.History.str.count('|'.join(lst))
Out[316]: 
0    3
1    1
2    0
3    1
Name: History, dtype: int64

# To store the result as a new column:
df1['Count'] = df1.History.str.count('|'.join(lst))
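
Since str.count treats its argument as a regular expression, joining the list with "|" builds an alternation pattern. The following is a minimal self-contained sketch of that idea, assuming the frame is built from the sample data in the question. The re.escape call is an added precaution that is not part of the original answer; it only matters when a substring contains regex metacharacters.

```python
import re

import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "History": [
        "USA|UK|IND|DEN|MAL|SWE|AUS",
        "USA|UK|PAK|NOR",
        "NOR|NZE",
        "IND|PAK|NOR",
    ],
})
lst = ["USA", "IND", "DEN"]

# Build a regex alternation such as "USA|IND|DEN".
# re.escape (an added assumption) keeps metacharacters in a substring literal.
pattern = "|".join(re.escape(s) for s in lst)

# Count the non-overlapping regex matches in each row
df1["Count"] = df1["History"].str.count(pattern)
print(df1)
```

Note that this counts substring matches rather than whole tokens, so "USA" would also be counted inside a longer token such as "USAX".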

Answer 2

Using a lambda:

df.History.apply(lambda x: len([i for i in x.split("|") if i in lst]))
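
The same split-and-filter idea can be sketched as a small runnable example that writes the result into a Count column. It assumes the sample data from the question; the helper name count_tokens and the use of a set for membership tests are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "History": [
        "USA|UK|IND|DEN|MAL|SWE|AUS",
        "USA|UK|PAK|NOR",
        "NOR|NZE",
        "IND|PAK|NOR",
    ],
})
lst = ["USA", "IND", "DEN"]

# A set makes each membership test O(1) on average
wanted = set(lst)


def count_tokens(history: str) -> int:
    """Count the '|'-separated tokens of `history` that appear in `wanted`."""
    return sum(token in wanted for token in history.split("|"))


df["Count"] = df["History"].apply(count_tokens)
print(df)
```

Unlike the str.count approach, this compares whole tokens, so a token such as "USAX" is not counted as a match for "USA".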
