How to get value_counts() for a pandas Series that contains lists

Date: 2017-06-22 14:40:40

Tags: python pandas

I have a pandas Series, df.files, that looks like this:

In [79]: df.files
Out[79]:
0        [{'url': 'http://www.apkmirror.com/wp-content/...
1        [{'url': 'http://www.apkmirror.com/wp-content/...
2        [{'url': 'http://www.apkmirror.com/wp-content/...
3        [{'url': 'http://www.apkmirror.com/wp-content/...
4        [{'url': 'http://www.apkmirror.com/wp-content/...
5        [{'url': 'http://www.apkmirror.com/wp-content/...
6        [{'url': 'http://www.apkmirror.com/wp-content/...
7        [{'url': 'http://www.apkmirror.com/wp-content/...
8        [{'url': 'http://www.apkmirror.com/wp-content/...
9        [{'url': 'http://www.apkmirror.com/wp-content/...
10       [{'url': 'http://www.apkmirror.com/wp-content/...
11       [{'url': 'http://www.apkmirror.com/wp-content/...
12       [{'url': 'http://www.apkmirror.com/wp-content/...
13       [{'url': 'http://www.apkmirror.com/wp-content/...
14       [{'url': 'http://www.apkmirror.com/wp-content/...
15       [{'url': 'http://www.apkmirror.com/wp-content/...
16       [{'url': 'http://www.apkmirror.com/wp-content/...
17       [{'url': 'http://www.apkmirror.com/wp-content/...
18       [{'url': 'http://www.apkmirror.com/wp-content/...
19       [{'url': 'http://www.apkmirror.com/wp-content/...
20       [{'url': 'http://www.apkmirror.com/wp-content/...
21       [{'url': 'http://www.apkmirror.com/wp-content/...
22       [{'url': 'http://www.apkmirror.com/wp-content/...
23       [{'url': 'http://www.apkmirror.com/wp-content/...
24       [{'url': 'http://www.apkmirror.com/wp-content/...
25       [{'url': 'http://www.apkmirror.com/wp-content/...
26       [{'url': 'http://www.apkmirror.com/wp-content/...
27       [{'url': 'http://www.apkmirror.com/wp-content/...
28       [{'url': 'http://www.apkmirror.com/wp-content/...
29       [{'url': 'http://www.apkmirror.com/wp-content/...
                               ...                        
16487    [{'url': 'http://www.apkmirror.com/wp-content/...
16488                                                   []
16489    [{'url': 'http://www.apkmirror.com/wp-content/...
16490    [{'url': 'http://www.apkmirror.com/wp-content/...
16491                                                   []
16492    [{'url': 'http://www.apkmirror.com/wp-content/...
16493    [{'url': 'http://www.apkmirror.com/wp-content/...
16494    [{'url': 'http://www.apkmirror.com/wp-content/...
16495                                                   []
16496                                                   []
16497                                                   []
16498    [{'url': 'http://www.apkmirror.com/wp-content/...
16499    [{'url': 'http://www.apkmirror.com/wp-content/...
16500    [{'url': 'http://www.apkmirror.com/wp-content/...
16501    [{'url': 'http://www.apkmirror.com/wp-content/...
16502    [{'url': 'http://www.apkmirror.com/wp-content/...
16503                                                   []
16504                                                   []
16505                                                   []
16506                                                   []
16507                                                   []
16508                                                   []
16509                                                   []
16510                                                   []
16511                                                   []
16512                                                   []
16513                                                   []
16514                                                   []
16515                                                   []
16516                                                   []

Some of the values are empty lists, while the others are lists containing a single dictionary, formatted like this:

In [80]: df.files.loc[0]
Out[80]: 
[{'checksum': '9f6075f4c561792e48354277b46a6810',
  'path': 'full/80832b9fca82ce0f58f4d23c511e5a1d657c40e8.php?id=2968',
  'url': 'http://www.apkmirror.com/wp-content/themes/APKMirror/download.php?id=2968'}]

I would like to know how many entries in df.files are actually empty lists. However, if I try df.files.value_counts(), I get TypeError: unhashable type: 'list'. How can I work around this?

3 Answers:

Answer 0 (score: 4)

If you want to use value_counts, you can convert each list to a tuple first:

vc = df.files.apply(tuple).value_counts()
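One caveat: a tuple is only hashable when everything inside it is hashable. The lists in the question hold dicts, which are themselves unhashable, so apply(tuple) on that data would most likely still raise a TypeError. The sketch below swaps in a different key, the repr() string of each list, which is always hashable. This is an alternative under that assumption, not the method from the answer above, and the sample data is hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for df.files: lists of dicts plus some empty lists.
files = pd.Series([
    [{"url": "http://example.com/a"}],
    [],
    [{"url": "http://example.com/b"}],
    [],
    [],
])

# repr() turns each list into a plain string, so value_counts can hash it.
vc = files.map(repr).value_counts()

# An empty list has the repr "[]"; fall back to 0 if there are none.
empty_count = int(vc.get("[]", 0))
print(empty_count)  # 3
```

Note that keying on repr() treats two lists as equal only if they print identically, which is enough for counting empty lists.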

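However, if you only need the number of empty lists, use str.len to get the length of each list and then sum the boolean mask, since each True counts as 1: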

l = (df['files'].str.len() == 0).sum()

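If the column cannot contain NaN values, you can use IanS's solution instead (len raises a TypeError on NaN, whereas str.len returns NaN for it):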

l = (df['files'].apply(len) == 0).sum()
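The difference between the two variants only shows up when the column has missing values. A minimal sketch with hypothetical sample data (two empty lists, one non-empty list, one NaN) illustrates it:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: one non-empty list, two empty lists, one missing value.
files = pd.Series([[{"url": "x"}], [], np.nan, []])

# .str.len() yields NaN for the missing row; NaN == 0 is False,
# so the missing row is simply not counted.
n_empty_str = int((files.str.len() == 0).sum())

# apply(len) calls len() on every element and fails on the float NaN.
try:
    n_empty_apply = int((files.apply(len) == 0).sum())
except TypeError:
    n_empty_apply = None

print(n_empty_str, n_empty_apply)  # 2 None
```

If NaN values are possible and apply(len) is still preferred, the missing rows would have to be dropped or filled first.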

Answer 1 (score: 2)

If you are only looking for empty lists, why use value_counts at all?

len([i for i in df.files if len(i) == 0])

Answer 2 (score: 0)

You can also write a for loop that iterates over the Series and counts the empty lists.
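The original snippet for this answer is not available, so the following is only a minimal sketch of what such a loop might look like. The sample data and the variable names are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for df.files.
files = pd.Series([
    [{"url": "http://example.com/a"}],
    [],
    [],
    [{"url": "http://example.com/b"}],
])

empty_count = 0
for item in files:
    # Count only real lists with zero elements; this also skips NaN safely.
    if isinstance(item, list) and len(item) == 0:
        empty_count += 1

print(empty_count)  # 2
```

An explicit loop is slower than the vectorized variants on large Series, but it makes the type check easy to add.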