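I have a pandas Series of lists, where each list holds a collection of words. I am trying to find how often a particular word occurs in each list. For example, the series is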
0 [All, of, my, kids, have, cried, nonstop, when...
1 [We, wanted, to, get, something, to, keep, tra...
2 [My, daughter, had, her, 1st, baby, over, a, y...
3 [One, of, babys, first, and, favorite, books, ...
4 [Very, cute, interactive, book, My, son, loves...
I want to get the number of occurrences of 'kids' in each row. I tried
series.count('kids')
This raises an error saying 'Level kids must be same as name (None)'. I also tried
series.str.count('kids')
This gives me NaN values.
How can I compute this count?
Answer 0 (score: 2)
Use apply with list.count:
In [5288]: series.apply(lambda x: x.count('kids'))
Out[5288]:
0 1
1 0
2 0
3 0
4 0
Name: s, dtype: int64
Details
In [5292]: series
Out[5292]:
0 [All, of, my, kids, have, cried, nonstop, when]
1 [We, wanted, to, get, something, to, keep, tra]
2 [My, daughter, had, her, 1st, baby, over, a, y]
3 [One, of, babys, first, and, favorite, books]
4 [Very, cute, interactive, book, My, son, loves]
Name: s, dtype: object
In [5293]: type(series)
Out[5293]: pandas.core.series.Series
In [5294]: type(series[0])
Out[5294]: list
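The approach above can be reproduced with a small self-contained sketch. The sample data below is made up for illustration (it is not the asker's full data), and the case-insensitive variant is an optional extra rather than part of the original answer. As far as I can tell, the two attempts in the question fail because Series.count counts non-null values and treated its argument as an index level, while the .str methods expect string elements and return NaN for list elements.

```python
import pandas as pd

# Hypothetical sample data: every element of the Series is a list of words.
series = pd.Series(
    [
        ["All", "of", "my", "kids", "have", "cried"],
        ["We", "wanted", "to", "get", "something"],
        ["Kids", "love", "it", "and", "more", "kids"],
    ],
    name="s",
)

# list.count returns the number of exact (case-sensitive) matches in each list.
counts = series.apply(lambda words: words.count("kids"))
print(counts)

# Optional case-insensitive variant: compare lowercased words one by one.
counts_ci = series.apply(lambda words: sum(w.lower() == "kids" for w in words))
print(counts_ci)
```

Because list.count compares whole elements, a word such as "kidsmeal" would not be counted as "kids".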
Answer 1 (score: 1)
On the original series (strings, before splitting into lists), use str.findall + str.len:
print(series)
0 All of my kids have cried nonstop when
1 We wanted to get something to keep tra
2 My daughter had her 1st baby over a y
3 One of babys first and favorite books
4 Very cute interactive book My son loves
print(series.str.findall(r'\bkids\b'))
0 [kids]
1 []
2 []
3 []
4 []
dtype: object
counts = series.str.findall(r'\bkids\b').str.len()
print(counts)
0 1
1 0
2 0
3 0
4 0
dtype: int64
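A minimal runnable sketch of the same idea, using made-up sample strings. It also shows one possible way to get from a Series of lists back to strings with str.join, in case the raw text is no longer available; that step is an assumption about the asker's situation, not something stated in the answer.

```python
import pandas as pd

# Hypothetical sample data: raw review text, one string per row.
text = pd.Series(
    [
        "All of my kids have cried nonstop",
        "We wanted to get something",
        "kids love it, more kids; not kidsmeal",
    ]
)

# \b marks a word boundary, so "kidsmeal" is not matched.
pattern = r"\bkids\b"

# findall returns a list of matches per row; str.len gives the list length.
counts = text.str.findall(pattern).str.len()
print(counts)

# If only the tokenized lists exist, join them back into strings first.
lists = pd.Series([["my", "kids", "cried"], ["We", "wanted"], ["kids", "and", "kids"]])
counts_from_lists = lists.str.join(" ").str.findall(pattern).str.len()
print(counts_from_lists)
```

On string data, text.str.count(pattern) should give the same numbers in a single call; findall is mainly useful when the matches themselves are needed as well.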